Irohabook

Selecting rows that match a condition in pandas (using loc and combining conditions with and)

Let's select only the rows of a pandas DataFrame that match a condition. As before, we use the population data for Tokyo's municipalities.

import pandas as pd

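# thousands=',' lets figures such as "35,830" be read as integers
df = pd.read_csv('population.csv', thousands=',')
# Keep only the rows where men outnumber women
rows = df.loc[df['男'] > df['女']]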

print(rows)

The result looks like this.

    市区町村     世帯数      総数       男       女   人口密度
0   千代田区   35830   63635   31935   31700   5458
3    新宿区  219639  346162  173743  172419  18999
5    台東区  118858  199292  101917   97375  19712
13   中野区  204613  331658  167378  164280  21274
15  豊島区   179880  289508  145334  144174  22253
20   足立区  346739  688512  345291  343221  12930
22  江戸川区  342016  698031  351914  346117  13989
23  八王子市  267736  562460  281506  280954   3018
27   青梅市   63142  134086   67393   66693   1298
28   府中市  125060  260011  130582  129429   8835
34   日野市   88402  185393   92983   92410   6729
38   福生市   30506   58243   29132   29111   5733
45   稲城市   39991   90585   45589   44996   5041
46   羽村市   25718   55607   28251   27356   5617
49   瑞穂町   14912   33213   16922   16291   1971
52  奥多摩町    2685    5179    2601    2578     23
53   大島町    4635    7716    3971    3745     85
54   利島村     174     323     175     148     78
56  神津島村     917    1898     975     923    102
57   三宅村    1620    2481    1356    1125     45
58  御蔵島村     170     317     167     150     15
60  青ヶ島村     109     159      92      67     27
61  小笠原村    1492    2625    1451    1174     25

As always, population.csv contains the data below.

市区町村,世帯数,総数,男,女,人口密度
千代田区,"35,830","63,635","31,935","31,700","5,458"
中央区,"91,852","162,502","77,241","85,261","15,916"
港 区,"145,865","257,426","121,326","136,100","12,638"
新宿区,"219,639","346,162","173,743","172,419","18,999"
文京区,"121,128","221,489","105,462","116,027","19,618"
台東区,"118,858","199,292","101,917","97,375","19,712"
墨田区,"150,855","271,859","134,678","137,181","19,743"
江東区,"267,262","518,479","256,116","262,363","12,910"
品川区,"220,678","394,700","193,644","201,056","17,281"
目黒区,"156,583","279,342","132,206","147,136","19,042"
大田区,"391,146","729,534","362,653","366,881","11,993"
世田谷区,"479,792","908,907","431,026","477,881","15,657"
渋谷区,"137,582","226,594","108,768","117,826","14,996"
中野区,"204,613","331,658","167,378","164,280","21,274"
杉並区,"321,531","569,132","273,057","296,075","16,710"
豊島区 ,"179,880","289,508","145,334","144,174","22,253"
北 区,"196,580","351,976","174,910","177,066","17,078"
荒川区,"115,944","215,966","107,283","108,683","21,256"
板橋区,"309,133","566,890","278,662","288,228","17,594"
練馬区,"370,567","732,433","356,279","376,154","15,234"
足立区,"346,739","688,512","345,291","343,221","12,930"
葛飾区,"233,158","462,591","231,272","231,319","13,293"
江戸川区,"342,016","698,031","351,914","346,117","13,989"
八王子市,"267,736","562,460","281,506","280,954","3,018"
立川市,"91,270","183,822","91,460","92,362","7,546"
武蔵野市,"76,765","146,399","70,120","76,279","13,333"
三鷹市,"93,665","187,199","91,624","95,575","11,401"
青梅市,"63,142","134,086","67,393","66,693","1,298"
府中市,"125,060","260,011","130,582","129,429","8,835"
昭島市,"53,827","113,215","56,384","56,831","6,529"
調布市,"118,804","235,169","114,909","120,260","10,898"
町田市,"195,643","428,685","209,971","218,714","5,991"
小金井市,"60,367","121,443","59,955","61,488","10,747"
小平市,"91,602","193,596","95,312","98,284","9,439"
日野市,"88,402","185,393","92,983","92,410","6,729"
東村山市,"72,676","150,789","73,621","77,168","8,797"
国分寺市,"60,111","123,689","60,901","62,788","10,793"
国立市,"37,728","76,038","37,161","38,877","9,330"
福生市,"30,506","58,243","29,132","29,111","5,733"
狛江市,"42,157","82,481","40,005","42,476","12,908"
東大和市,"38,852","85,565","42,208","43,357","6,376"
清瀬市,"35,454","74,737","36,092","38,645","7,306"
東久留米市,"54,257","116,896","57,066","59,830","9,076"
武蔵村山市,"31,640","72,546","36,177","36,369","4,735"
多摩市,"71,851","148,745","72,927","75,818","7,080"
稲城市,"39,991","90,585","45,589","44,996","5,041"
羽村市,"25,718","55,607","28,251","27,356","5,617"
あきる野市,"35,519","80,851","40,304","40,547","1,100"
西東京市,"97,350","202,817","98,839","103,978","12,877"
瑞穂町,"14,912","33,213","16,922","16,291","1,971"
日の出町,"7,383","16,732","8,224","8,508",596
檜原村,"1,181","2,217","1,100","1,117",21
奥多摩町,"2,685","5,179","2,601","2,578",23
大島町,"4,635","7,716","3,971","3,745",85
利島村,174,323,175,148,78
新島村,"1,381","2,722","1,325","1,397",99
神津島村,917,"1,898",975,923,102
三宅村,"1,620","2,481","1,356","1,125",45
御蔵島村,170,317,167,150,15
八丈町,"4,365","7,465","3,720","3,745",103
青ヶ島村,109,159,92,67,27
小笠原村,"1,492","2,625","1,451","1,174",25

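Source: 住民基本台帳による東京都の世帯と人口(町丁別・年齢別) (Households and Population of Tokyo Based on the Basic Resident Register, by Town Block and Age)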

The key is this line of code.

rows = df.loc[df['男'] > df['女']]

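We select with loc and put the condition inside the brackets. Here we selected the municipalities where men outnumber women. The comparison df['男'] > df['女'] is evaluated for every row and yields a Series of True/False values, and loc keeps the rows where the value is True. This notation is specific to pandas, so you do not need to think about it too deeply at first.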
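To see what loc actually receives, it can help to look at the condition on its own. The following is a minimal sketch that inlines three rows of population.csv through io.StringIO so it runs without the file; the variable names csv_text and mask are chosen here for illustration only.

```python
import io

import pandas as pd

# Three rows of population.csv, inlined so the sketch runs on its own.
csv_text = '''市区町村,世帯数,総数,男,女,人口密度
千代田区,"35,830","63,635","31,935","31,700","5,458"
中央区,"91,852","162,502","77,241","85,261","15,916"
新宿区,"219,639","346,162","173,743","172,419","18,999"
'''
df = pd.read_csv(io.StringIO(csv_text), thousands=',')

# The comparison is evaluated row by row and yields a boolean Series.
mask = df['男'] > df['女']
print(mask.tolist())  # [True, False, True]

# loc keeps only the rows where the mask is True.
rows = df.loc[mask]
print(rows['市区町村'].tolist())  # ['千代田区', '新宿区']
```

Writing df.loc[df['男'] > df['女']] in one line does the same thing without naming the intermediate mask.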

Problem

Select the municipalities with a population of fewer than 1,000.

Answer

import pandas as pd

df = pd.read_csv('population.csv', thousands=',')
rows = df.loc[df['総数'] < 1000]

print(rows)

The result is shown below.

    市区町村  世帯数   総数    男    女  人口密度
54   利島村  174  323  175  148    78
58  御蔵島村  170  317  167  150    15
60  青ヶ島村  109  159   92   67    27

Tokyo appears to have three municipalities with fewer than 1,000 people.

Note: pandas' read_csv reads the contents of a file into a DataFrame.
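The thousands=',' argument used throughout this article matters because the CSV stores figures such as "35,830" with comma separators. As a rough sketch of the difference, the example below reads a two-row excerpt with and without the argument; the names csv_text, raw, and parsed are illustrative.

```python
import io

import pandas as pd

# Two rows in the same style as population.csv (only two columns kept).
csv_text = '''市区町村,総数
千代田区,"63,635"
青ヶ島村,159
'''

# Without thousands=',' the value "63,635" is not recognised as a number,
# so the whole column is kept as text and cannot be compared with < or >.
raw = pd.read_csv(io.StringIO(csv_text))
print(pd.api.types.is_numeric_dtype(raw['総数']))  # False

# With thousands=',' the separators are dropped and the column is numeric.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=',')
print(parsed['総数'].tolist())  # [63635, 159]
```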

Problem

Select the municipalities with a population of 100,000 or more where men outnumber women.

import pandas as pd

df = pd.read_csv('population.csv', thousands=',')
rows = df.loc[(df['総数'] >= 100000) & (df['男'] > df['女'])]

print(rows)

The result is shown below.

    市区町村     世帯数      総数       男       女   人口密度
3    新宿区  219639  346162  173743  172419  18999
5    台東区  118858  199292  101917   97375  19712
13   中野区  204613  331658  167378  164280  21274
15  豊島区   179880  289508  145334  144174  22253
20   足立区  346739  688512  345291  343221  12930
22  江戸川区  342016  698031  351914  346117  13989
23  八王子市  267736  562460  281506  280954   3018
27   青梅市   63142  134086   67393   66693   1298
28   府中市  125060  260011  130582  129429   8835
34   日野市   88402  185393   92983   92410   6729

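Key point: & combines two conditions with a logical and. Wrap each condition in parentheses, because & binds more tightly than comparison operators such as > and >=.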
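One way to keep a combined condition readable is to name each part before joining them with &. The sketch below uses four rows copied from population.csv and builds the DataFrame directly so that it runs without the file; is_large, more_men, and and_failed are names made up for this example. It also shows that Python's own and keyword does not work on a Series.

```python
import pandas as pd

# Four rows copied from population.csv (columns not needed here are omitted).
df = pd.DataFrame({
    '市区町村': ['千代田区', '中央区', '新宿区', '八王子市'],
    '総数': [63635, 162502, 346162, 562460],
    '男': [31935, 77241, 173743, 281506],
    '女': [31700, 85261, 172419, 280954],
})

# Each condition is a boolean Series with one value per row.
is_large = df['総数'] >= 100000
more_men = df['男'] > df['女']

# & compares the two Series element by element.
both = is_large & more_men
print(both.tolist())  # [False, False, True, True]

rows = df.loc[both]
print(rows['市区町村'].tolist())  # ['新宿区', '八王子市']

# Python's "and" asks for a single True/False value for the whole Series,
# which pandas refuses to give, so it raises ValueError.
and_failed = False
try:
    df.loc[is_large and more_men]
except ValueError:
    and_failed = True
print(and_failed)  # True
```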

Problem

Select the municipalities with a population of 100,000 or more where women outnumber men by more than 10%.

import pandas as pd

df = pd.read_csv('population.csv', thousands=',')
rows = df.loc[(df['総数'] >= 100000) & (df['女'] > df['男'] * 1.1)]

print(rows)

The result is fairly predictable.

    市区町村     世帯数      総数       男       女   人口密度
1    中央区   91852  162502   77241   85261  15916
2    港 区  145865  257426  121326  136100  12638
4    文京区  121128  221489  105462  116027  19618
9    目黒区  156583  279342  132206  147136  19042
11  世田谷区  479792  908907  431026  477881  15657

Note: you can use arithmetic operations inside a condition.
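Arithmetic on a column is applied to every row, which is why df['男'] * 1.1 can be compared directly with df['女']. As a small sketch using three rows copied from population.csv, the example below writes the same test in two ways; threshold, ratio, and same_rows are illustrative names.

```python
import pandas as pd

# Three rows copied from population.csv (columns not needed here are omitted).
df = pd.DataFrame({
    '市区町村': ['中央区', '新宿区', '世田谷区'],
    '男': [77241, 173743, 431026],
    '女': [85261, 172419, 477881],
})

# Multiplying a column scales every row's value.
threshold = df['男'] * 1.1
rows = df.loc[df['女'] > threshold]
print(rows['市区町村'].tolist())  # ['中央区', '世田谷区']

# The same test written as a ratio of women to men.
ratio = df['女'] / df['男']
same_rows = df.loc[ratio > 1.1]
print(same_rows['市区町村'].tolist())  # ['中央区', '世田谷区']
```

For these rows both forms select the same municipalities. Values sitting exactly on the 10% boundary could differ between the two because of floating-point rounding.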
