From 8dcbd43a01e166573439a4eb7833767e67310f8c Mon Sep 17 00:00:00 2001 From: wangyangting Date: Thu, 7 Dec 2017 12:37:21 +0800 Subject: [PATCH] =?UTF-8?q?=E5=A2=9E=E5=8A=A0:=20titanic=20=E6=B3=B0?= =?UTF-8?q?=E5=9D=A6=E5=B0=BC=E5=85=8B=E5=8F=B7=E7=9A=84=E7=9B=AE=E5=BD=95?= =?UTF-8?q?=E4=BB=A5=E5=8F=8A=E5=AF=B9=E5=BA=94=E6=96=87=E6=A1=A3?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../titanic/image/titanic_output_24_1.png | Bin 0 -> 4722 bytes .../titanic/image/titanic_output_26_0.png | Bin 0 -> 13391 bytes .../titanic/image/titanic_output_28_1.png | Bin 0 -> 23542 bytes .../titanic/image/titanic_output_30_1.png | Bin 0 -> 14028 bytes .../titanic/image/titanic_output_44_1.png | Bin 0 -> 11367 bytes .../titanic/titanic-data-science-solutions.md | 3592 +++++++++++++++++ 6 files changed, 3592 insertions(+) create mode 100644 competitions/getting-started/titanic/image/titanic_output_24_1.png create mode 100644 competitions/getting-started/titanic/image/titanic_output_26_0.png create mode 100644 competitions/getting-started/titanic/image/titanic_output_28_1.png create mode 100644 competitions/getting-started/titanic/image/titanic_output_30_1.png create mode 100644 competitions/getting-started/titanic/image/titanic_output_44_1.png create mode 100644 competitions/getting-started/titanic/titanic-data-science-solutions.md diff --git a/competitions/getting-started/titanic/image/titanic_output_24_1.png b/competitions/getting-started/titanic/image/titanic_output_24_1.png new file mode 100644 index 0000000000000000000000000000000000000000..0a2b198ea35966b165d8fa38e0ee7197776613e5 GIT binary patch literal 4722 zcmai22|QGL`=1#GCEPYjA~~pJP(#*nOU1E7bnHZttA#21K9eL#I70RoTgMvN8K{XWlgqRdX{3#}JlkHKJs zK!c;^7z{R_d;D5}kNauZo8Ha9LWxNzCr*1I1km58hT@tvprq zdcCM&n(F<&RcE_GaR-i6S8r|{ao?lo)f-=)FRb6|TCTSHLrzySCK^|^`)YN>Ru{jm zX2uT*_A6&I6SubY9rS8T&j>n4%W|2nZ?9=9AL+j^z8o+dY7QB|Z&Tf3G%+B2?hF+M 
zROvWb@H~t+gx>;!cYsHr=pN8KVZWJP61xFZ18QXGmq16A?D-H27}5(YX^EVa43e$c zPj3J{49hOQj|X)u?&vorvJ1dUS&fiwmrY=X{!SDCrwv&#=u3nn4`%X@X`oYeR$Y}T zl~9+h(`WMToj1KG`Us~7PG4t_`8svuYzOb%GVNL^e}&27jV*5gJF}*WExopoE;b)2 zg64Mp497S;h<6EqchoJ;%i3hHqfz;~?iWUtjn43n1c_vJ5lDjBHH;vgf_JN9fC$hh zLobnAJObM#1n;Fs*MEMCv`EKvDRBZ%_fqeKU*~(1M z4(of@N25_^ke0}P05md2F1KLwEW5HW$v97}LD?4nP>4|*(t-qTsNUJT$_&HUTR^%z zmONn))beKesPDj0pqF0z}xD6M2idb%jHj;ERyv5*9pFjgSs}OF=z}( z6E4IMu$5D%qJ1Xvc77S!dHb6OJR!d*6iA;DJNWf(1*$8jL1E@Kk5WS@T-WdR9Z6I8 z0-VNW9&F0>gSx^Rf^Fv3$+s0R(2^ZYPu2!WD z8Yrz}Ju-DJui4^%XmS8qTIxJx*)@c=&elqFbG?D)^jKP8+s88?{IE`MHIgZRlv3`t zd>@+QAe!f{_ZWi}Nz5>DOdd~9vW&nyTF=$g?4~;!A z0Nl%Zul@t?n84!~zP_6y4w)Hrj9oko!l^F*k)BJnt=jbCEPZq{T8Y?ZvN5a!PTNa2 z`WrzyW$MlZLQ7|7@qte@fKcZy#r&RAT`EvTy`3Z8-U`yk^!U7lhtCnf5NiXm!AEGS zN2n0F|JSDYT(|~ky*G%1P_cx8R2SE(aTXFNa4$`g*hvK~qj*#vx7@yZF&@?GTFJnJiyh$%xJM(;`UcE zkHW&A+-t*+_d74*SbC5T@1y{^O_?V_?oI$O_oxJy$TnjMp)GYEF!ER{-?~EV3A@>>dt!>?rAFZb+7)&Y81aDEY%H5%TYOJs7b$aV9N+oRAlxL&e%$AW8M7Vo@KScgZ18R~TPcJ=h zS_h`?Gc`GK7HN4a&N_;uF;Qe)@m z9qnOFgqued%kh7dow@}eCZncSVC0P0Z~mp<*A?@R2W+(Ic=8vkzZRz-JExZaEZe6= zZv{O}%cSYcb#syOHGAngAi=0?#iK)5WW`kca&DN0d%o^us>Cq-wvl#S6G7xDXW=n^dwjirF5{f08r{%I78P`yyc%#_BSln` zmDAiF?B(7+{bJy!4gO-MA)WOD0_m8sWS1bS))s$q(pgKZu4%-U=0b-fp6xTab=Zrp z-9hAx5NNWDJRSJ!rb$~C;m_!7%_go}^KHd@N)g`;^u;(HuB%77Wf%F#?rlt2IligfYZncMtTWHnY7%P7OnBXEc*91xQ4$;}x zU%b33>cY1FM}YoD3%0=~i`u?d@%!k`pohT~D_96@``j%R$L$fMXeZXbN6{!XgDV71Z8!)Lfx4{f z7NxL$*hows?m{i28St#Fr`x35BL*p)ZpZ z1*0ufe+nOl1G3$}lUQlehV>bmYYL4mK)Sf{h$Jc5l(~>p3^%)yYFs4U#pj@V@YrFc z`y>Y_Yt&LV-Nn5sWN&G}nup!e4A0%UIc;_cS7s_!bCEj^?5<^AtOTI=hw zhiez=4U|CKn!$0^9lJ8!sKF06lUwd1>Q~)y{@P?Q8d8 zDmEyK>QhX|OFh;il9A;CRejBTA9-k-XQ|l`!P)oZa+Vm`9u$>KM)3)xqin|7Kf zU5#Cx9d(ufhNn!Z=7$$>q$wVOuPZonD9WcUBNB74x6ha`9Ar zY<0io&(@(Ci8c+?Y&>0|KDNp8-4kzY|38=c`@;cG9aB$yCyK3^;6D0ZGy?TNTMGWw z0vXLX5!}&&qpkV@3L7FDExA3-Xb& zvleJ(P~Izrdws1bMm+3|3x1_$MJ@bZ&(vcZSML7{@0Hm^oBl-_zwcmSfm>m|tx#d+ zOe;JGRRm+F_9NXx;a|=YX?5=ts7152LuxXh*S+3Mq-5Ioa&b=Y!2?7o2Phu-kFMMQ(FGnH$r4;KW 
z4sF~0|GmccyGE_ z5L*L?)3~mW%1ZAjewJO))XkJCAC)3paRDMj8Ljo1Vm!Og!Y#p$sKhX?~VJlB12SSMGHUJr!u*dM?8&Uh3gqB4VEM` z^M?X$Z&~q_@Qm{%`Hm*x{9=`J^**gQXYv+4oOf+6LdOqg^lW@K{}T24=<%%2Ib`!l zT;6d{bqbSFI7%73@JT{=Vfn89Ol(1)ed-r&@c_Y_DUkshE=M10>%p$*W7GtF3eElW z^C6vIyyK_-z1xtS3ka4j#V_;}kFtlrFZbiUtf4{t>bhzve*ymu)rpmD?{~d=xOhP$ zIaf2*c8^(zPzrH9_ytcAtZzU<)j9r^M$^ZgytvcR5NEsZ&JO_tJN>s$zCkoa7gR2J zf55z+>&sd%A*?Oc8|gUT+o?KL)f?Hpx)N60rG54Bi=0Po4@K9XN@%JF`obN}LuwIH zPf@5ky~AVK+-c^N0_&ITXg|Ks&2}yKyP}4;u}bF4#h}Y z0isuMetW}=PY9GDG=NDbaTYKLCkW*MZ;x4KYzPiN-X`5Zh}sNm*DcR8B~Kq5A^$x5 zM*B&okwupN!|GDCrijWOoDO~iVx$f zAT^B|L{*?P9G7^ff-a3~uWFTs8IAnpuRCeNI>E9)9g8xWuwha&zc^2+r5JUl>kIn; z;CXXFhhP}c3I9Js`u9Gl*%;O}HU}Mv9P8}Mu97kQf^Qcz1Q!l8C~Y-Nc*4SgU6wnr?A zPKEv zr^VZA29*pPlQwMS8bmt;4-I*<5}-K+sG<-NemI}qT7;(6k!Hg`3Hh&UkoO>f(=)|4 z0j2H7Smt_M56)11*2ylqPNpm#xkRGU@X52^1)r_@ iM#WsUS1ROOKsUiwdA_&r3urZDbdQfU84k&s+ zK&02>Fp80;AW}k!5r2e&O-m51v?ueuS=b`w9I&zxN?KUjPu; zu=)?v(i6UjzNmlcd)rILz0O_=IOTH&AfLMA?df&NljOR^|BR0>$?L)vO|{)>8Y){n zE?x3AP*?xs4QgIK?&=T8efI%i3jiM6cOsC=>Io?%_=T~TX6qz5KM1Bv%Sh_ET;^l) zyygF;bmF^vdQQ>aJ{*7TU-c7i1U0Ph{pCB$Q0;e{jA>DAw0RF0uaFAE&ufLmmeW~tV0^}5h2ZT zqSR4kA(!A~|7OmAL!G)J_Z+m+K{NuJtb9gQSPM#a@yzRl5(3vj{gwInqt-R;M(lrql_u8kv^>;#n_sdmj~O6Z*JeRxGA4cuQbJ4L{~;Ak!#Y) zT@FCd-^{f`1rQ%FN7S)|AV)}=4s%ayNgSE52XjgCD-F?B8$hSt@bc2FGzogKJE5$6 ziFxrfbVWkUjYxh9pQog$cyHQ8X>9!va9Vos8Tf(aVKwo9kX1O#J~GinaOUexWe$_- zWfR=$3~~<_?4+>?+`iU0eMOpc;jFVEYkF=()-ZvvqCg{q)qu=%qSm%jF59LRs+{J;(dv5xiZ47BJ6+`8hu^~p`vKlAu-OL`g z>8jJkW)-v8JWd_OH96!E9OngWfVF*D=tiBHZ7=56^bK}&Htd?y@OFZB#ZnWQJB&@( zec8i7#A#(@3{34RuGvuP{i;9$Q?r}Ad>`o|L>11)kep-ik?n;mIeiw-74+m1{A_pA z@>op2`8O&z*>$zfBaP%_gQie6t@Lxn$@(Z+WOI-wG(ul+#x~)?rAr^fW4w=HcJoAK zEmVo`5bMB|=8#vE8H)r7tV9XfnYZak0d&@^fHd$19=^Xc{MG>;F3?G|X&vRsGxxC- zv2rwIEnh=+QD)x}=4wCX!DMjkkdJmgf1p9*`NU ztyzqmRzJy<>K~!X86rsNW`{5>a~%gZA-a*x*2JPb7BA87)f8_m^^nL4HFixm#o?0U^r=j}q;n#q$eK+>AK}4E5lkT!WXPX_V0RW?MjIKe30(tU@HZbeH^)9a`cG z2auqcLC0~6q@>vJ{gVSVkpnsOb|o&qMa$AV*H7jXZZ0?u590yB3oHB-mY*v9 
z*&QQ)xFBUf3@KxQKNKEF)NK*$chs#5{(y~GFQZMrLiBTn?7KR0FVHW56E&2GBNN+` z#FBe$iO%K|YDphkRgw0{l+-XIk(vXOL#$^=UrFrL@qq;#o;qr5@9~_d>m~=FpQ~k%O21E#S>yK_BsPa!bQleH*HiFse=X6C? zz&g*#QM+&Gl8=v6x3*?OkaCt{qPQCPzPWbr^}Fn+^a~?xr7`9ps`yPf0d(oCqQpRvXUG|n(#gDlvZ*akm&^2Fq-3W*|qpY76Q-tXFk zO^2Hv!}p1PTR1xzlf-E$#$iUL|FhUgb3XBhnTn^X&3m3Di~yCmKy!#riWW_6PrF07 zy=BrdNuz$yiRlgx?_|I;%d*IiOCn&)cJz^H1NQ;N|5;=HqV)VHFDI==B7tC|T(*_g z2=?|A%inVggDk)gN}XYxkB6nSeVx!?{s1iF^m1Wgd^BLKtS=R4>TOMw73{`S4N5f> zRkxz=q38=x&KbRJ8v9_$D}#4OIz97sx*62-#Dt6oW*(WxacW{(FU&8iJnS z22{|y{lIu_*TqiP*V#_GoG^JzH&Ai!zpwuPNeaHU5a_+2(KO~8f^=**m04IE#wNro zcfRi(+cW2_D3qkH?sq_40iZ{gShsCO-XYSlGV7_r4nGp{IGe&r=v5l}287DHmq>mB z=atN^Z0cAyil+h&Z`J%U-k;JhfBlW52mH_u1{bLiIgL~ki+Gb{`SPu5uEc6P3{+~D z#m$zxK*p42zzc@u2}9FC=P7#tyo@)p-}Rn0&%olG-pB-fz+s@V&Xml*?Va8biDwJ# zg0|u17@M}2!W6q_c-{t#aiX^~$OuG`+!< z+St_<@I*RTpT}E}lZ@y>4t>Q-7>)_!S3(xH594H#@ZDNY*OnMOubKb+N<=f;OAor< zNR2Olu?uq;dcRK`fCzqfcqs%>)(x`-JchRe4*m7Z#w$|k1X6o2muSL?=W0vIVE$#k zCJ1}MVZiF>`y;;50D*f5dJ0}h`cNDI)yI0C^O}|NOn~Bk{zqa4aq?IbgQu)ZjGy6P zXQ-9V4VW@D0uOkb3@je{GJSkuyD#(9rAB@1QKB~jKpLn^FR_Cwm2(V+b*xM&Z69|)ay+-i%-biSoQf>5Hq`t7J|;=nAy4)eD6 zLQm&qqMs%4DGU>jA3rXz=nGwMpvEgT?62Vq+~KXssh*jEs=p_YHu@P1E`(|J5~$r- z{3m}b1B z+NjIC+#KPw-u7GLyQm7N{#;9Gaoe3$Q7gK3>Wcz?uN=}mjU<0O7O#i>BNuYm&HK#v zoG`XL@Tq@e_Nyr6%G=T57X)VURTp_yJ82&t7r`JuW3a_lf9Qg_)5tv5ImK^YNGzFg3UMC@tFWB7BnX|!-u&orsWW#~(t$#JqDg4St?*0B8sB%KF=BV@Ni|d_gQOQ`D`=T+Ho-0zfP>rm?Z{n0iXg ziMGbDn@3s>?H+tcD|jq`A+`-tnx}Wf$p)J^Xnaz7ARqe{pD!P#MWM@ugJ!69z+R7=B<`8w4;1$9 zEaYlYw64pxV;6k*8;e@Jc&!V`yD0Wivi2g+@-o4=eu|&AM0lBmtqI*<>Lq_5mP78E zw0~3-(>TL$x9O%Q4;<|ALj>x`_6_gMe8cB1CbJbZg?0^^x$;o>%g`A8amuDw^|%M# zogVdMu#f|+MX|_*>~kT8ro!~kPY@l3SJ#6^-&mv6-Cc6TX;i&Y*nmAH0+V}YvH17s z{B-K`;^U2gv7_sydy1X7IYG`Nc>YCxe_|d%owwM!t*ABG-R{!+l6oITBH~PJyNifO zj(@C4?CQU-4B#2>6*3EvXz2)1K%+5Gtt8coKcdu3aW?3TWpV28A zS_!LNNM?ftzl4SI=KI->Pk-6WI0KGla_^V)dzfq;O?0c9-uCKLg(shmP20Tx^0p+| z-}J~a5-g`5slOz@uz`~h9OQz1mJWMW+RZ-~4D<=U%j 
z{HIRk6u*>d;BM9IGi&np3GYxgJz~iJ=_Nu&N+?=Uvdu=H!sQzC8*U@htSQ%|ev%Ea zx-O$+M~vn#spRKYt(;(-WB4;#k&Gx=XOVGvjHgEltq&V=n?3{d3WbW5^3QGNe7V>y z?mP94Dj!J;mA_5etj}KSywZQv=h*^6aeuqW{AEcpK0M(9jl_1}W43~FB{$FX-Di@` zCHpKt1zMo8duSIZ+(+{nRMz*8n+c1}G50wD*+ZeZG0#$xnQu*U2JaE%hiijV?=z*r z3sw_@EGr*4qha<&^X#k=_X30$ugJZ4c2(LD+eC?en>nW$Lbbw=@qK*DdGP!hMogsd z0QWEPz<^})mZiUe2kbBQ-C_P+JP`H}9y{+|!3V8u$MiYZ1h&{cHo33bnrJ`&Y&8-ZR$v@X9b0{uDTyQ^ z7m0p)-d!qIWu~{V?{oBJOU5%wLfrBFM>Gw6fY8MdzO& zea+7+IDN(jE74>-vyGwqsIg{)Z8G!D%<^v5 z$oLjc(oEWP1<9YM4)p6V5FoyYwZcP9WA0~sDrRLPiJ*1M=pmg(2RmYw zPj#d+cTbY+uag6lm2i2E#Xy_3;foPS=0olAcFT&0t(^6YFm3+j{JHFN_K+bYBjkxq zwVdp#6b%&jwn(ZSkKrxGHuL4eFqz!<0c(}4II}BKA$WAyvu+`~Zt$FDB0Dyjea;S; z8Z#rGnLkeIed4>6p(n5R48GtI)fkNe4P8Qz4rc>no`AwAWhlpsu4X^&sws<8 zBAuPzOANbo_B7f5*B6fj?bHjf#$0jl)aPG^y(`SuXQLSTnye28k zPiOsNx@krAmDo8YPA3e{6jh;$D&bw4*wDu`-wH9K^KntjPvKO<46>GpyqY6__n!8? zw+FTQly&=x4iNRHH83I#Ei-})9mWmTVlRtTK`AMkW%JHp{BdgdCNFx#sjlGMj3-WM z>oduQ(t)kpPMZLwvVmhV^(i7UhUNL?{v=PTCRw)OT2l;TP^=)q*edjn#>nix9SGfdyfWG%X*_dY+b$T$$zQQZ!h zwcg@eH0FFdRWnGvu8W2=g(fFa)d@j&l%827KI@Rjj*xnP>q`e! 
zD5hkqd46wRZ(O5YNuI@o7~?|Hb03NR6zv<6@vQu=8C3Sf44Mke9ferhX+RT7;()FP z4bmiNrdq9C!ih#>$)|+LVw{%;=1mDg|xih%s&xr;7;eUnEU>u%WUY zL!EV#97AoxS9L8Lfl1?VnBTqH%Cb63Uv#_%7h_(K_$ZHkax}QEF?#s8EJ_cOtajQ6 z>2CyaXRFWSYk0qQdg~Zor0CCJQs_VhAc8@Aa4>qU6E{BzB)5b~ zOG|T2)z=W($@Ye@k1gwi2hKK`XVf3~o2~q`13<8gW0n&}y0}-M9IJvIv@Sx0h}sd* zoQCp&zR({4OPr2na>O^W+SQB@gzj_IEJ{dw9w)c98~S2bM9kHlTs!+RM-WH5Vp4{{#?S+2^sctMrJp!u(%@#Uv?w_^hFoH#;B ztBitCp6OOlYTrPG_Nln$s@KgY+K32|wPZ-in3sn#e{?}PO5=YclK!%OeN$T6tC83Z z=31g9Dmk{Yf^52-Hcv=~ax4lK7`6AcJq9Y#VBP{2@!k$ZSz#GWj-`(_C!CQ38*DBW z(ev{@b@=3d^gS3~aiA?QJ*DSq%6dKF(#PZ)G#gt)51%}?dI94U+#NSNXe9C}@)@o3 ztG!dj>$5I3FRpGUJwbbN=ExJ^F^tb1Ugx6dz0v!%=gr2M?my9-m=i6~YSXKm=Y=`Z zwuBH-^}Lz~k*7RkT{PdOCoqv%bPoO4{p|2b*Hs-H%oRsYK*sci7+xp9lV`0hF_bKI zJG$@iNrTl#LK)4~O$aX-_im%0gmqgvp#(@fJR!$0U{qpW?H~=I8VcWFhZ2q3$ld3K zefxYpkGAlj&v1crnDQ-pe?zDK`xJb=NTVG<=rpC>2X&On^l2pWLNBOhRq&*$=lq`$ zPBG5wDSUvE5vi-Bzfn)N^cR$<<$4S%-Zow3H=lh~QN~G8Wi3Tm6o__+=wbAV9Y?V! zYLUg=IJOqT@k^|t_Oa)o+zH>v+Q2#dViYC0*sE`O-rab_&OSi9$qqOY&)my6;k^+KoolBu?Eqd@MF?k$K?Jqeo*Q?=$3xhF$W(UmkeUi&<|g*wqUT zj5CH^AtwA9G&#hbqDyQO85pnp27#Q?lt`~pO*|^2vuDrrm7$eIjNJ5Nv$pr|-`Bjo zCAgSV6~9NR^Y&RY&Ny6NO0HI>{X8?i3Hi{OYES$Ii1svBACltpb&@iVN#Zs~bm3m! 
zTH&IsXv_cTPX!$c(!CXLZ(JosPE8m+zBvnY#*phpUj_o@y-z{MBiVqu{E?raxC9gc z>{ai>&S~CBF7k;nhp- zE5}(SK)75fDR`l>nr}M@{m?qYqB}>-_5`Gbpr6=N20&|k2|8aQVGAX=L7h~AVk5K# z{Womm->;dqZL0tlE0vK>(!Z^gd~`GJT-&gVQVcE{>!pUB6$@} z<%`8BCNl&5X~kFXKEBdahw9&6Dk?Llzh^9HC0PM#9&D8Rbqtmggl?d5somd<=_rAV zfx?o7DRly<@EZMrBkECrxFSu;MF_#*Wi-B+>O%9zRQ5;A?SgB}#oXRj#zL@?v-))H zeluVUEWo1bsr)%__AnJbNhD^YYe#(UD+RGgxE2jKtfvZA=2cZ?dWnKk-w!Ne^-gB} z_~Adm8&dfxVN&CwS9 z-u?rX!5{gr_n+Ra0GMG)Y}#ar)2A875&k89^Okj*gAe%&60k0&jEdF)-4ZgS6sxap zapxtKC&8|3SX4j}1%a$l9C#t+bd?o$%<-T@hU*%QpxDii$C<1|hUa-H2lvSRK{jv^ zF}FL4CH)k7VuC#DBL^2ZNP7G)-}qm?@!#={Hzk7g03epW>IL{;*75%Z>(JXuyG`zv zhK#(F>mBbi5xcImUK9T?=r-6iblNk`yue-lLu-7y0d1I2qEu!~8@lP3_9~fcGR|GT zOR$gaF6{8TttCL|U3v^hKZa{5@;I-}{(Fr}v3iwoG_b7fJ!X`)GKkobYeX>1t|nHN z1)lCP#5U9zupIPC>pL=4p3jL#W__gr?LqDB!w1ljxJPwK6W&_H@TXJH0ow$1OwGDG zOh0m4*a0Y}(LsaerC9bTY*EY51SA9nEew6SPH^_5r`}}_7h`L7(_Y-4hKbIFn||2S6enUx zxftuF->CdD;?RU3;|L=GF4v{?J~#UO37(Ij4_4##?msFBSLXZ*&-WHSc(w*RVw5{& z0}pms`2`*x&Umh};v8v;nZY<|Hjkz9hMLf9C=oo%kD_)ik0Tbu>86>Yas^s{ z_HY!kE3!x{Ze^pl4kwxs;f8Na~g66x^S?Cm1w19<+R zAAQyh|F=hClD`}_5eO}|cx1}wq!BZ?Yys7KTz_K;gd5U6`WmtJF|=5dCCXepqlQrS z)7hx08OezLa>4>3XuSq+w+*omoCrbC$1J)zhMFbj^cl~I(_R3m)&q{p<4t-`+i1~u z28ZS<#d1x`xVdqwo}2e__6tu4r^X^S4&s!p+k`+%%Scvi6~;3>?}}u z_iQOE37&7aTW-Fv1Ko@^xQe3LG(k_2R~kO_ z+S#mpQx1;N(dnd|bgI!X(h8TC@Bv**nWVgao33`B@98rmNU%$ce|XJ|U#OC|0>?JV z3ZLv>`Qc_@N*pc#WzZEkU7&gMOh^+ca4nK1s*i9aVQ0amC7m3aDCPUdFsdLW=JrE) z(uRe5k~f#OHcBs~MS8uhyn$IRG#*mcc4zkjRN+b;niZuY()6U;Pb)%!juqANn7o{s zl)en-Rm(;no!gfOSHyPXLMvF!7TR^6&k~6uN_J+G<0a1{B6r zcZG63VHick>FJr{@20i$hT;d_rKP+qMTaIWg6(QJlVe0(b^od)m4O8}rr}=5s^wOb z(GJ>K#>x|Ta{k%aMTlEvs(k};;H4fT1_*y+A!s+h+NG~c5{v&~Xqzm=HyGLw@Nvx) zL0g+(d7zGowrm@T#sGs9DwR+7wxDk>jRZd8wQIpLdP730<*sG&qgH~yq#?`Vd(%?< zxMW|o$<=P0=k6h>5FJu}n|U5>M%@u-z})i@!*>aIMY}Y&<4_UkFYkbkEFqMDRC8|u zR6b9t&dN7AUX&IIO6^1~2;Rn<0MHt%j%gwf;#Oz*YCh-)c?&M~)}p9mFFMbz1JuoZlsP++9w5&njblekqmjdm+|E>oKO|dQ)Ez~kx4%rt zqN>WFA!utlY=G0lZJ|9w=F4=LGy-(4}_w;_E#BR)^8nmd=Jx{m;5=K8F4|9@`u|H=3F 
zuP4ceDGG10c@IN;q$YGQ83X$GLDCR9=kFme{K@wspPRW4_0L@R^7%U?0;nUleMCeN z)O~|vdu4BD}aHznV0p_MB-;DICVnw|K7S$*LgM;(YQfZAf!$dU{cX&2i6e!WI6Qogj zTFg15s49utC_*RarCf+TA`ZMVLaWoCl6C*R;le2wmya#oLeosT{P9=#WIR)xaUAT$ zIpC?fh~8II#;S<~E*I7FiJLL>h!#4Wv}IM0{(0W~FN~J|GT{F{J?ZGZ2n)((nzUC0 zU1j~J0lb2u$evZ}k3R!l%ImzfIj;y&*esmQiMAKWCRSzUxcfhH zK(vR4M^I?OtAky@C<79OHyu&u%Qu@z5K6CpM31_g@^eSO{pp=4CoU$Md7``z`L zo?65Ac>H}me>jvm_Gj(-U!*Gi?vg6SDfiUM?`N8J(0~l)(8OBDr~Q=t%@Wgrx{fxj zv6&x$CcFtx=%thNA!HMAx|)zxGRgiD5qNX(&}D6}L9JK%>Q8jhNStVsh-5c=m|#lN zt0FMou4(;{Heq#j2^jNx_rYD*x4mPnHLE5pG;|s`Qxrn|K`62?L!XZxRyXR_o$F25e!>fSXiiXnn5Ye)hh|eVoEpaNd`KH zgn)h9Anl*|;wHn__m(+U={SZvzL&2G72?DqHj!D%omA!tGy&tF4yLYT5>le90+cnl zGMr*GwP;eejl{ER&bT#-&LjExB7RDDRxUl<|EdqWx&u%POD5*xrF|IF%sRQ9* z$8LUJ_lGIw_ViXtc#2E^4FmJ6_V%ui=Ae|ynziqrt;_#oM|tGb@NK^hIaPo9(k{Kw zyz5irg=fxQW5XjK6mPCBNCO(`i?Q+vuDJ|pC!vS5k4hlw3({oNBIC$eMiL?3x=nRy zy=Bjya}w?ot#9WB79Urkpxf1{O+8T`t@C&$UuD;@m2#ePC06F1TL9YXKKuitFhnVES1L0g zvYv7VF@uK(s=rqvWuoRaH@o-ODDbc^GC04nUf5V(ZS%$0w=t+Oq$8$Ua&MDJl~tIl z=?^!TzU+hLE(*sGz$EWw@zvaEYw{wcPOjRtWiaUPpzV{?C3NXWYd7d`?hKcK5kdRlTNwgB3Tev9i zJ$&?+q%@J1Q1zL3N-b$#ONXvN3rEsNW#EOIc8QY;wW0Eu1NB^3YM#dk zU+USlD2YhcMxT!iE4w;ya;OBmtE+eu^0YSkX;gt zr^Di^wszH>4dFcBHO8{={bkI6ajm~tb>K^ezUg9VvF18Vk&j|coN$rm=(64pl!0>1o5?Zoyd1|J381JVH9)&& z__yf%mhTgOVm&2QZMVT2XcQC57kHew8tFZ1a=FzXGIHvb{$L|&ipW_tMWkn$--?C_74JsZ znK%(8N*Yl&*E^GHY==S4T<|49P5~jHW$_LDqITMMGe-*ggGKxE9d&TY+2_IxBSag8 zU8%)&0u(Qswqh8a4({eSyz;Ui{aN4C0d*8f9&U;`jL|5L-)ymgVdQd}80OM-ARen+ zGHdln-%_st9=-*iv?JnMDwosp;C5im8hQJG=@?&iGINOBIvHudIgd3i%lJZwIp61O zIIm$%cNHF3{4SBL;I5}{IvOkgD1~R-6`ifxBZ;-q?s6ARW6s(R{J@y^58mB4Ubu^b zuFKxqXg|3y1998--iV#vd%O)fB*wei=)9=87(l0x&h(0`Ld`$`o3EO`nQ&M6l_tUFQ?!4@`>1=AEO<-Dj%P`yM*QGkrnJbMhh9kUF7EhjzwLt!M|y zC}A=QweHd_Gr6)XI)$wqq{NX%B9RC_C#y-Zdxp7T%Mu7koK``lTS$~>VJ5HE;Pi2H z7<5orc)l8w#Qho$5PD#PDk(=#BW zy8{x4u=z1{00%d1xVHUCOhZs3b1P-cfPc<|xReDOQDN4QO|%cxi*up->hgM$34PNFDNncP={yDiMAM$ z{}#5D53W%?-8f+A*ZYNs1}f!VIpf|d5IQqpIgcp8Jih2B0|6z97F&F)la26upWDT| 
zS-Au-VtPami8WK%TfP>E=6`LG{8OKmz9RpY(wK?p#J)oR_y+*Lw>+4;-{tcE0k)!p ANdN!< literal 0 HcmV?d00001 diff --git a/competitions/getting-started/titanic/image/titanic_output_28_1.png b/competitions/getting-started/titanic/image/titanic_output_28_1.png new file mode 100644 index 0000000000000000000000000000000000000000..688b97a10b92fb8d07bdbb4b843d08d401a96f38 GIT binary patch literal 23542 zcmd43by!s4+b%kQlz=cGT{@H!4vn;N(qQ`GjxhHICPgv3?!Y z*I9hO{hfWCea^nl`SWmHL(N+6TJt{r-1oD>-m1uv+^4$_0)a^6-@w#CAS`jLRP-qGA8R1!3Z3 z4ze|Iaj>y>v9Wyr$j#iz+0x$b5&tv3XT02xKDxL#h(V$M^MGgeP8QJbw(|)f&?AsM z?4_oA`p&F}yXNs&+reH+$_}C8qZb5^yg-v-b=iRgZ%W3V>1`jjRFA1QPEDLt>y^}K zan+1z88+6L>Wjdv3rmPNpJYEVG3scczz+CfuKHo{vb&j3riWiXSt(XDYdwWS03nI9 z=&*M^>lIzViKZEu^MEk=^P#KmDY~JP3ygaQX(%YUCuvP{1KGq)l{9~tSwx22Pu!V$ z0zQc7L1j;5MCl-6Lz^alT|*xti`h%3cdi?!>tDgN!h5X1laC6p_CkAv z!SzBP|2%*klAeZcKp{(x6StL(bDrRH^Qhf9E=WXxbzq(sY{diY(RM8{Fnf9*>;Z?x z&Tfv#ntg4&Ng8U?R&GBZy5RLkB^GYw3H&e^QPHiiZX~x#M;p=$+Q*OE zPtIGySmQ=0tPglwxB*cyC9)Mu9FYB%ame)_>!#3u&Z^g?xKucd4&+mO;LQi# zr2)1QI9K@YaF}mM+ttn98}?Y6pfE-@mx3L4EeVBplpaRZ6#ZW5&9DJLVr+d~Y0QI)IJD76xag zL`JvG`_nr_7aKas$zfsC#s)Bj2d%OzWGe3&=R$R?S=tyOhxgtF#(lH5S`$_lXV+Tl zrJ*3%5gh-2R^&)NwVYYj#?6s@uED6HlL~FYW~I<6UHX~Gm3xY3eX25tP3;Ph(qXpJwftUuZ`53@z>dMMOxL%to{UcaM&e|AzN*+n$ z6FV6R_)Fu2zTtFWc6g>TP<;ve=1j9Jh=v9HU=xR56Fe!fKooZoHkmL9*!aGYO`~ zhgM3~6uo{4>ivQx78jfJWGRa$iamovL(*T{g;sPn9eq#cIwhJmR2ev>UOCDVipOA9 zF{gf=2g4LRbP;@cCP*bWuYkFiSzcK6$4b&sD5*k}E-E!4F_Eg~g|&e7W~GgE#bwpz z^nJGjzpO#KKRS_oBoSs6Y4}&Q;+$Sg4odJ-oITJ8>|3QcFW*4MxYdqWLsB*VmYrF( zm^}{ju1?>a>7}TiQ{;U=S)$hy6-+wMBc^YBm`S7g<&s+)JIsXYF?-#TOO=qBn=%cn285p z@G6I2<`d}s^U<0PqjqGhDZ^9I@dBD=Ci~gm@-3g4%@J}_aylg~=DoqSa@9=Ppy>50 zx`o13s8iKHnhE08T(xq#5pwQHp_l455bdmM4?~@!7`=5P=|j%ao^du~N%IKI&dOzS$E2d{lZV%*<^D;FPq@Omv*!Gj9tfDP`V{U>l=Ww0fV9 zmqenqSe9MTREL4-x?QN5nRtNND8E0pmCnXeiKleRE&ieA2tO#w8agTYFx3{3R@K6{ z_`bI^4ECuI2f8$TLvhyL>ZjuCh&3DA@Hu`Xl?6>Nlm96?O3DdEeD<`|QOv+AR>I?< zu8twh(L+8%&e4@n2-tyGU%+Xbv3(RpO3PvjkI&v!c>Fau81AKQ?g$NfyRHcU^3#ELCx$OA`;V 
zJFjj+H}c3U^oaP4;?^rqCz0XU2AEhp)xmBtJx%=7Y1O9T0k*@p6`E5X_jp9D? zk&@u5BqMZjG55gWV6+R3S|{^<<@;MPrdtM}T0W2^s6I1MM9#T+bZ^R4f&vWeMdhf-^SH#(;7jN=VcbOq>GNP5K$}RJuFsw} zQV&%wR=a~~@rm{lv=YIaKdI^fxOgwC4Q{6wD@*^j>2drSZ*WpIT1u--h;HAtTz2ILE zgU`P0K9mNb3${dNhk{Y53|lulf;GDZ#AoN1TNCF)Yow_Bs4c$8>A}kn$8mJ*W^Q!! zxeiQ|QTR2eEjKt1z9HH&ODAdDKdFNz#*)pO2`Um*|Cu@SG-@j?N_*XQVep)^{HP}r zZ(KOfvLj{Zm0j2qhy~WQKWQL)cVk9pVt5aACv{#}Jkb!{P>=>D+e=*=^ni+S4e+D= za^lia4rY0h_bn=j>e(RsLdK=2i=ez?wK47m&jnHl|3Vcb>_ zb(e=|mrr1X`*_Y10^umQ6r)E?J!W_*6g=xW+!C)9f4FY$w;m_iUzBe!bNcQ>WMd2b zbT|ssc+3hr*5cF}dk6y8Gxy-!d_D^C!uefyF^3j><*g(kEq40>FYjb^&)QH9e*SSi ze@8U6KL&awJe&LNElSk+RYRc;ObR~cOrJ_6?`AgPTO$4gIq!F<PHa`r@?&lCE@8%frgaL`&h$;A(;jw&>E+K1GFvzBITv*sp4& zSxb!IEa6^8C}uAF6tdT;f1*+JHqIay<*Q|yDp~gtf-?2S_m$cdE~W9dJgyRPnp>P; zmNb-fvQ|oX8#nv-^rbg08L}wg+9aYcnao4d@W$)c*o@naNZ65>2z{C8m(DxaauPTT zjvlOH&zZBBV0I_313kyy0olDBGgu^nx(L58abuqC-b<8tX_%vd#et^ptVJ7W&+g}E zRA7aZ1*h&bp|+@ztZiRW939Mh%Q_VTAlcQQSM&Lmy2!>F ^aYUhy+>wS-HALD|O ziaW`unj%7b;@Uj?4T$+lw19wzc<0V##ORB=v=#uk-YzKBRc2u9SCV4esY>AEvhOqP zOQv_7)`h;KY55UpDHh{icd;{cG!oGw^Ii!Xltya5K8Rq3aQG>H`YM1VSFg|*BZp^v zMH!)==1yXr@fIU#N0dyxqK(%^#WzR(Y~ndo{d;~2J;SlLH8hS%kloS+qJDf1eODMH}x@S;g(_!tC+T_ z{!_e0Vjz{rwF^umRU&mXFjZlJcuO6rV=`Rm8JE-5BO9YJtApnPy48L@z-I8|%$adO z5Owy<50Sb;MkF@kqgvrcV5AU)UmJ#^`}b+)^Pgli&A2VEK>qB{?9mo~=b}Ct=4<4H zNE6(r%VH}3vzq#&_${`f`WQdBp3Z8AUKO**GZpFgN2_M_82Lx9FiItL`IyENcUnj4eZEDdR z3y>lwUAh73sL|0_TMfO<_X1Q1y!zMm*+brEqpq{ZB?-Bfu{8F?0u`m}5NNaBsJ;}kqg*n z+D?aW4jGM8r`5{^cn2~*{Gt`Azr6<-W4DX^4sqKJ1%CZ@bBJz(h3qF50Smo?(@6tt zCG5ih4p}o>nd6;THs8-=z`A=iN>`5^;u`I+_PNqO)f3mv->oF_L(zyS)04Wh(H#4& zmM;1$v-Hn-?OPnbwg%EF>ZnppZU~FNxW6^LdhB0S#7YT=wBM5H1FDB3JVHn#pUc3| zF5}c8=KZ7LDL;Pu<*O}AmIe!_X%CT+$2b2zc-CHDN}W(Mu}ap{$N1bN@z7fF-ucoT zElws*NUH-ErSt>uNN2_Y5I#|OucT&Id*ITWkoT$WK;p4n06mhXo_UoHec)e>FYLH} zm14v7Q3%&pi6Yo@03(_{tTLkMnv1&8g_C~fdu9q*yP8`*JB}BBql&?6%UQz4_}n7A ze_SsU$bCrt+HSTkMOdy3XVw@Sc@2xQjPXS5^vY=Lu@_Xwa0wI0a@?PY;VdJZP$vuC}i*SMQhh+ 
zD6HaD-_@XW$kkHkmD_>Q8RqPk{i!fthgw?0Cr`RYA{_`sni@G>WVlFNT=Lg>;hroT zB}nsP_rcW(nuV#w>11`r;{Z`aF`2UzxCPTTpKKYsTczUiR!s&~A2+wgtrIPHX$tm7#v16^a7ih6{qrWNQU?C0))L>6 z@$(?{Oa9!;FI3TepEyTH{*GqHFs0w`knA8=qvy4HOJ^9fKsxHz7wc6J>cMPlF()@I zA9}>+=|Hy};%YJT8{?u9W}owMAjL&??K_a)?GL3R3JKCB>S_P@5Y%RzDoY#_8z=Wz zDe0b|aJd39o|CRa9~ga%8W|o!OC6*2)_A52x10Dt)wahCRD zT=$$+F_w`ms_~k?>X$KqmCZ}#Zc^_;hfDJPe|?elJkZlbgQPP)X& zYF6>bR`VgXnjO);h&{VpX9Wh`yNg$ef5p#7(lAqmm6_4y$h_kZYjKZuY-(W2w4B|N z1VTGy+e0`H88X80;>{I^hJ+YCR;17V>$5Cst2b=PspWPW_aOuHyIc}1^I~#TENNf> zYNk7gm1x^zdVSKVd7*k+dFm0oc-`nU$h%%pW-+ni@Yz1K9hF(dLvo$8LV{vJ@a%s> zwcs_eLy9`+^`^hX(kOZtQ-fkgZ@nf2cv6anW9zq;upTC9{jhSD>KA5segX;cCpQNB z1C68TvD#dHp4FIp$5E6>k9Hr^fQ08hrV+9$;%)X0(T}k1x`?K&Z}nFW2!cEjAKvZ_ zzVn+TyC0oYA<-@GT_nu>?c1+Hl$yF$Z>&%~w{BcSs6Y==s0x3bcze)4GWrdUT;kd< z1;s6=KGl(iv9uk{R4-tt3$zx8INX4&>Pme9jSild z`tBBLu~l!ucRW*BQ1GvEj&Vh&W)XBrDjLLq`r&b^CcGyFD2WYgwG6}ZXP5PK6GPNbr=HPpy&F80ha!IB)CTd{`*S9y)-;MmdLzYry9TOtN-WB8U1n z+GlAb>*(hjtnOVN5@n1SAijLvGHeZ8Hve^NBXT4?9W}IVa_0XG5S;ir@$O5{iEjjc zM5YjE`t5bd=q2R~m|a)Bn#cR!D^jRbF35ENDa@<{Wmi;gShPW}e@twnC_$8lOjJU5 znYBFsL%X6=FPddef2fn`{N-yfJct2H%XZLWY|T5<(-hE;UX>;KHDaf;CJVbSofr1W zHIVe)XW!<^ApE#>^vN`%GKnZu530ccMulzauMwMpk|LBK94z6mcz*Lmc8l5Ly*Dw% z>l3p;@t$(mGC&d;aoyw19MD2cKh$hKB)I8dQJ8$$C$e`34Oq2+-5Z41_bwr>&!`)E zbd}qplG_-d06u^Ml#v>~fT2OEpphUiy}IW25%6jySOYeN10Ga~QG&_DA#0ErGm=7| zCmzHwDT?1SDB9MijQ^AefhQWS$fa8$mc&++@iMx-S^Um*0Q;umN58_ms{?fOP+H+{ z;PX1reH-BWBYpTfRx567fcC7P8N4=`7&ns)2^uu+`9%j5JXnQ{ODS#`T_tCH(Q#XG z;PxJMFUAEiq`5Jhj9^%r_D^1`i+72O2$1d$7Q7VhF_}tweHvw30+|%4;ZcL1tWs-n zv@E=gHkK+2LE<#V3XRr}Pkik!3}QH!Q%J7;NN@jaPUX9ieB{x=KIyYud(V)>3r@62 z$kJmVGhPi>;MA!w@i`iYfvdPVUwseLJ1FS`_$0FR)j5tNr`)qqb00%e>Bdcl zL0YP6Tt`TItU)NyIj{=wuP`41*E+Og)$EpBw>ZYUFc>RQCpUG+%=L#x9DL>*?h$v& zjc>fY;_D{r-k9)>b;c>_!PJ=_YpSkQ3SW3|i_%s%; z)YUdEc56DD>OYG2J3XlCk@7KA9U(U_PO76ZDj}?@*%i!7T-YY6*`GARsPuf3?@AD;T8Gv5x;OHeMaoE9`_{t0!K6P9_sS&vtn`C2leBzR83Rwj>}1} zt(n0@NjcqW(X4WrhF~GX5ka5_^7Y^LWWWtA<77+O$+n05fmG(z-zXKtfbej 
z&#oB|%R-6M2c{ZmlZ^y<#`~+?jUIJsUuI~hFx`lPv zB>6P}6QusU@0MEAp<)^ZBy+{RDzLnW9(M2~Cj>FhFRcPIbX@o2g!lxl&9171W12&7iinZeG-@)8&}}_T57IYO`drNJ2}A(Kst>C;Iyy=t%jIL&1J1d7(~}l_fW;UPNH!r0 z&(a*@h7TwxDvAf?(3?)BIj}cwX@x_`BLo(B%h3Es{$-7OWq49=^HVn@7OXKJZKaLL z_H}7Yh0s;bNW1T6#QP%P$#7@bsSslJ7{ZVL7o>?juQ4W|-S&%y!nK&Ykcz%zcVlZcpf8I+i0;HA9)08~T>f_t^b zir;jD2~TFf9yQhqxzSnw&oyRKj?BK{gg&~&L`K(--_ zii#A-IY`9|m=nwlX!uZVChLWVc1849Kq{uLF|CmC;-r!n>&{9s_TJ$ffq9xmNc_?O&O z)3p^<>M)-Late(NAKrT6B8{$tiaUM$dDq(P*d-RhI}@xxRxto=nI6uCN4!SxuWOC9 zvUgp}us8hsbo(rDNv2JAy>Qe8+*MG8ZTA2|4o~8nq64f@cV-)2h6OFy+D1lmkZT@4 z`9YSyn{W9YSeo4#IxLN|Qn2+E8J*7-_9ivd#c7@cq~RZhjOrWhjx`Vn>rrguc##*n zLy1gdVbmZ>1zl0|>cX_@%lcUuHHqZ$sX}|`(tWv(;Ku_5-DX=RwV5;kD9ZSMherQT z(E+UvGz|Cu9Y@8N^lnvvVQjh@;$Yu&NlPMnUV(e9mKp};S{pNHz$s55`Y^MMX^JK1 zf!R{aEN|q4HbLDm{pS@w!$L*KFH3%wIiO;QwSmHW^MO*NgHm2X1xusbYnWSjPnqlr z1-O?ASq?dj4z-~`>IYHnK7!nmAxR*I)Zl}EOtTUhXnkr~^xlipZ^>4dk-f+thi z9$uQq-o{{F`Z~Jjq;=9CQx4i<0>Et`753~#xDXKyq0&Pn^5u3sYAYCZQE@qm-vqMJ z8uNmE<8mP7m4$a>Pqyts$Vnw|y|L;DHq@r6$&tV4=AWL86{44L0%rec;>`)h0A-ND zaRz$O*{>8gc6QK+1NKeuiKTZQYC5L@v>Dv?ZlMUmkF$?|(&2;rejc1(r0{pxPJ< zgr7c?R4=apEyj$K6xzdr(z{Qf@^fxG5V*}h)jNBFkl5Z$!sTRZIQ|81pX5W5DhM z%f#)yWWr(`HxPiFjCMemYMn5Lfh6lixqAV z6-I>%GG5h(QvUuK?#_vV1M3F#- z#uB|@Z!!U)K2RcA1aSEWp~RXJ&#@)&-aUATBT~3^A2B!K$b^gS{sNvws-2#9pqU`L zXa50%um6PS2xsBwz6Lt$XC(9i2YN!!%8Slf{kf_s5Er&nZbgm1T&7}O(Y}DG2VD9) zJ6e^_azHXVX)G%5VJl#AUOACck`utXkC(LP=R4--r*twlK%Cco(^)%a<6K1$>SVrS zf@x$1%9j=5v#q02Y6AODI#!Hy5^cKrM^jEihDm0jDJTLWj^O>QCy#W6G zaIB3#Y)I>TGk$aLECM;VwF1u71gt);t9b1*QA5qaqCNXk=ks8$6Q|wL_J3Q6!+~ma zBR}34R`!EV?)l7Zs}xy#79FBPrrkfxHE86mGCo@?*Va;hr`4W7O;TVwkVkZHR9@_5 zZ+>xys;s#`|7qpX{Y>WUro+)q__k(qS^WU&m~&<`s`~x%OTY(?g_%{v>E&5Uj;PN2 zYp5aa+HX7vLAa)RJCXhRsOasvpMADsh&yb>tfO#WL(SvRr#TE&@w&rtUi*SR%N>%l z^=2Y@oHe%B4*FYs!x{Q$B15VW{V%xM7Ldrj%zHh3eOBc}AGm}Wzg5i`tzR96g9L6< zXVn)4BPLk7(C%Ro2juk2kI{%jADhSt9}iR#*80EZmj1XSD%x`O&pmzYA=yS4;zm^(D zSRIBHDZFY~uvEznCowF0of{r?0HlHG6)h9fXMCtS9q5tNl;Iw~;>7B~=7q#FhJ<$4 
z5&q5DSruz#v*4rgxFm<@-wokChUwsO8W8-NNJVt7-_eusAc0@j$tc0<#V)wmZk}R`d2ECB$|O$yJ-|Y=i*YVtklPh9$5J4 zbxhe-OXmuEzNKqzI1Z$WU{gPQKwbWR2o&Zi1J`!L0I*v`d9t zOgGEB)UmWO<$-VdR`0)c*x65^Z@-lIvo*4dIW@%_NmR%0)_1F<%t?P`X+0bFI1$8J zIK|ib;Y|u{ZVU^OQ$XckHg!}v4=ojG;EjAYeRYq;hMIdaxCe-WXV+L#(_gxo4GrZT zE!lut$NePJQn}&5cqUvE&wWo|J+lj_0*dmx&m_Dtfi3SOJ-%G8|8_~-d1ZFXnby92 z>~IsB>7o#g^GIskBCM23MSZ(F4xzlLZ^WRI|8RY%2?7`1lqh)g&)6-aa- zQ%`=~$4bY7Ph{)jN`1;q?e zb39-beat?*nCgaBF;(sY+8q03Kc~;TJ8T~&XlkmeDuga7#^LP;6J5s9*vpq;0Z1T3 z(T#oSz9Ogla%ah}%a7u0FHd>%SkX?$D*^AMa)A8Wja(i7Kc$IY)UxbqQ;RW?{jm@RdtHoZpS05R(Dh2nzra08QCTYKMJCxJ<;!5 z3PAmL*Z+7%^M^u}h`l*W`hXAU^KK@l_LQ|SmUrXuOa?#M8p2YVM$QR^-PYT2$K39j zJ)`rf$lT7v36{8;u<`wO8O&`^o0WJ$dIj7DrX@Xa_rgW?p7_3QBksPpU&Da@j-jN_cIMZs zrNZx5{|Q%&fRj8%m_Ob;_+5Sa-EWIHKoIw#QAj$597uzU4ID z38u@ZHoe>4&jI`u|2tfDXi~RL((gPu>)MiVyOufH_*0$`fyIFebYISxQ>F0O9Kx>( z7t!STin^t!x+3-Gdztn`ZS2ot_)wL!%thW|kv`OTw?+mi&e2=TN5;SIB4O3#?gnniu>#PUUM4DF!mdTXbk*K{y@u@%|+^iz%CMAKkAsK zpRBqhSorrqw|w)_a>(%lummb1Ds_Ww&`eujvCG+7|1E3~`vQyko^mub(k7ccxe#Ul ziR=ZW3_Y#7nEWhpXK|m9lJ%GhrNJ!|<$&Cm&*Kbwc)$T= zSk7>vI&*Di_y8sHrVSrbp#`tas0bpcRu{Al33V9!!F1P28+N_)Slo@;=@Z9l16p3* zO~OeLa;HlT{rtUe6r&xg*wX=Mc*`n5C#?)?o+vXLRgW3kg;bEOd(PF+^~1ga0E!pv z(~;c2m6^&B-6l_FXiH_7&$)_5HmNqo{LhxpJN{B?qqJ1{m}lb;3ncHaf860W{u26+ z0;!Y!$!)3t6>5f^5uh5F=Sqcqe(IxbjO}W)Ex>NZ6ytD^Yk}f?%@Ba{ft?b@Ol))zM z%pMOv^VnWy{B>9;Yar9jt=Pn6IKNYStK4;Q=JifljArifO1(sw1 zR{%Z)fJgruu}<9im7)nWEz=@uk1hgB9^xH$hsGQ)k#0Pdl`We2cW&Zsewdv|OaXXm z?oq7!t)VlIAco+B&>j;1*7oX1%dg~1-tU|@jWg>lI46vRKXtjUAG%rAw~oW`_a|zd zY>dBB{x1l2DI2=mVYW&sv&D-u$>?|OoH-8GYG;C%K0EKo$H!4<)Rs!RTytcY!qc#F zKr;vYq~>(@oVxG#Z)Mm+IKHungK4=Tdl5Zir$=wV^&Cw(v?C2Js>)VB$eo5t-DI4s zDYqTV5N8`Av~O>dW9-a9VHGs zBkGU)S;Cy@-8k5~_PcL0ycMd=MlL?3?zG@bU=f3nK5a4n`khn>d?h;G=`lm98*)j} zgQjMTaqs18tzAnD3x?1b^|7ar*n0re7WO=dADt|~Gip{V1DnG$TpYztLnjmHwRn4o zWDT7P@-kRw!`VQ^Ma9KU<}-DU_(klWQ+N0SpbP1PQY;YrB~4CU?=^tn_o;9FNOJb8 zTuhForKKD|KpJzohaKR*vc68*-{0>=S65->Ss+501akHlg$&cW6Q7agyJ(Dh8831l zMm)G<=y6mVd-s=v$J`QRApWyWq 
zM>^B3+U1_iGwO+^Kg;I{Whd_durtJe+;B9TkI8)@!0!6;RQj@yt852>_ex@5w|0K+ zXwGZT_QXf>KQQ!i)TF}`A5(*}f~`~Q;8oK2h5Cu+lyNtEFJN*fB-6l|J$9e zwZY8Pi>s@vtBC%4*XrvtQ&KxOm7ks6M$OF3?0&JCkBFib$P&T%SCH9(dQ8TGlqKty8XX za05Fi&)8|?eeiR3w!yXNY`>FKgqo7l16sclySzPK(yv?f@lT!?raph-V7XfolqO=Y z;=WrwQ3KVl;a<-PVd{IMq(r>3vSI@y^qihcVH^#h8HY|1x$zSHq)N+?Jd%`@6t>C_ z16z8|ezh}B{knb^BUx@`My*oowXN5?b9>(F;*U??mYnvez;9Pu48Eec=npJFImc1g z)cdih#IT-{wxhNcvWe=J@)CcN!7AVB=pBV8M@bW2#&L12Y#y(&icJS5hdP3Z#G8%= z#WOw(q>kZa7Z)ql*v--rIke+ar*F*EJ7+tu4+gBPnB59ZymNkN+!YG`X+2&ne@_Q6 z$ByRHxirV6E<$@XH6&uGD=fdudFLDY z{ifmgaq)=pybfHF;org2g~e=FjximC$pE9yypAay)pnGn!$Syq+4`iF8Ey^;R9`NN z&Q(FJ^=AY%$4@j`%YJ_OfQ*^B48F(3#f58(j-bibCb2~L9MbzeZ_DhS_lPT+rSF^qavp+vC&Ajy% za-eaibytS#5ByAh<3c^1+!_Timp=TwaRqgGn4X`1o2Hqm_KQG+<^LDv1M&gKRtM4JwXQ14j-fk0>(>Ztllp zQ8UgZw+a>hGQ!ZtPu14Rcq@#$5Zdrg){EfVYm9a4J<1f$0qDMQlhC)*v+KSj`(`H* zkr+o|Df=GPWq?8kFc1NVo=R3M`pw7v)C|-`z%$IFAS|u*RfneHAA{#|iZPXIZD^{< z9-5IaUm_6LiZC;#x7V$KC@H$c0=;_E?gy823I4d zg@1M|YV4BV7cF9IDR-6*R4talq$vjYyrt^|6`s-A79mVLWI23!d2hETCwcY1lJUSjwwn&~=eWvtx#rA11H_S3O`OVmgeS zYhFhKA{tD0`bk8^{g)2o6CA$-GsCKuDE*F#8*+!)`iJg+J+fcB3HE!wJ7v&)_Rv@A zKv;r_wEwik+yg@=%s8hq#v=ugbPm_@PTieHyjXv-^XGL>nwdwMxx`e;N7{}`CbM@; zEq!&GK4pwBsOgC29~e3>Tay!j#-F>R6z^g+cdt+i594~(AHx^4Ei;d-Nrf~Mr3 zjY;M1r~FMzCLgCBp^q8bfe-!fjY`DIgut%Scj>mho0g_MAk!X8P`x?-Bg{V>tB7en zpDB?XiA{T#2{=#X@{xR1WpvHfl%jJfgd7A*nYx~+Ex2OWz%Db4P3S~UY7iYKM_e{GHudx`) zJkvJxFu$yGTy~3M5Ovxhf;RnT6BqYdj)1oO-E7BaR>~5v7>pj0y0kF`!U+&RU-vO0 zX8vqvAm%2inxe909_tigrX372Pm&iF^8;x6%Yk`2{4*nA!9zHUdv5iXQ2afo% zH73Lg$YC|T>4P@m&OJLR3M`laI7c;~_`;=cp}$kA`%kQC@!!_yotf(5f&QFdd+ECA zPqK2+QH)1V68&AeI)p9sdt@ZyRy7HhWYelYpjT2A&evQdC3!w^x{;0ualiw zrkD^m#Y9%6WZs+=rJuilBWEev?%29C?gVF8yma}|Gh-HoZBW_c#2NpI%i!-@> z(JWpliun_k#-gNYR~zQW=-p*t>R^Pg1!s9UMBtezD@z0E<2h>Tc6(#J2mnz|0$4*_ z(!dD8V1D*9niO9&9=6rdNY;(uUZTF;rU(%InWzn1)JhPmL+PRd2p|Cad?u#lSAu`n zE;BT@I82QH9Gf9hP&HiL7H6kjJ^?cuOS#`MstVi4nhLoo^r`!BM)~j|v$;p9KwJ6A 
z0z1G)^FK1{#ut}b3u0uZSdKD*)hYp2-wTNAjWTgPd9vN&7?q)dJ@%)o7E*9KlwRwP6PBmVXEwhQK*t_N$O{I$HLHN%vFucT$FWA~Z`{sP4*o#qRRE9pq1%5S?~ z8qV)ic?6$>fH9Iq(|v$?71zgkl?mtpH9{%6l3W#py2uV~ zBK?N5x7DkV#guN1IJN|6^r&Y9dcIE`OqV-E+9j4ujFVc3cZQS7qUyqdGH5Knar=QC zdQYlD>#bA#X${X3z{CzcLi6W#zXc+nF%T-n*aQoEHAnxtxU6 zRtJC;>Ad=-5FA-*a!1n$8qTfg(g@1WMKx3j}20wt>tNoLdEC?Q?9DD27Hs8pI(-5Mo-*bu zsva`TeD(r1j@qiB+pZ4R3I{H!az07N*45=~xL2M&;l&N4egh#0%@_#2D-;pym1NV;VGIdmCg)zxJek8!qzJIRp2Dj_j5?D$!ZUI&8Y^;)?X) zK7Iu&*399);p|iMc+K>ycj~$dy5_pahiotV6B}737M*@BD4y@}@gU^J0bh1I1{C5Z zKq21Ey(s#hDje~vzkiz7YP4W0U;vmBf!C_vJ6Wt!Z<2Dy0@DULL+Af3z}9P2x!uw% z5V-6M#D>Yd8_0C@x&UG)3X6Fjz`aw0Hqo&Z$*k!G%Q7yH@fVanBB zbLeo9jOVe~SXf?SmovXdLc>v?ioCITf=NBHr8iI!G4@?fp`Xg30O%yJL7{t_O-)UU zg1P#};?$sVxR%x&xdfk$oEr$@B)X5EuCJQuiZ6k|uYkK|hBo0xf>48B-2nC-#0g7< zoOt01_vR%$tI*!a%?z=2j^hi=bG)wc8l6b&qxI;L34}(w8%Z4SId)y9YX?GGE_|Ax z(PznL8tY?~#Y5Z6K&tzGGn29!BhXGgl-0Vwo%n3boEk3kaNKMH7`h5J!W>A<^>~B? zi>uOCT2X;TdFp}T5sLL9@dMVb(KmW8!?@b1Lxz@#`WO2C3@=|fweL+LB=+Jr`IwmR z66bPBx%2}c;kN%!kzP-F*+|#6`uH{XN6Is+v<^0>e?@7Kt4k18Gl>3lH=PsbBKV$R~;PDIO&&kKcixEY9(wIIEis0QowxIj2-R;8P_@Ka9vqA%ybl^8Z zKV_-iRR*R-00Ueq86f&=dbluGsXMGY^K3hDjd+3k(Rd|i93OMPkkTI;JxF3bXsw{ z2{^#u_}H!ZOl}j5pEV}WY>dVBIS^jJ95?HtL`9JuK$c&90q6Q@tc^9)xVMg-6t3}i zhjYOk&Z9BL*rxSAj#P51u2%n5U`-1z>9TB^W$SW{;$t@hxOX{K>c!S6U#57&dx-C7 z1Upw(SL~lYMut%VZ~phNs4EW(L@Jx*EkG76*os4?en;tATJlVf^*#qi=)Ppm4A>P5 zl^DYGpJD?n8JsMTHV$(ED!fNxK^E-saD$K-TZ+iEI4^7mtv90eiD$CV>U*d+vwKKf z$-@bJz$7!VZrCCEj?o(}Yo(2bur(09zrunSO3$xT#Wb=o#Fbhg&$bft{&5AzlmDv} zg58`4RA7ez24LbGE{7+9mocoD7^L$Xc)py&t?lbZc{e0GWD=E1x%nK@-s0urds+iv zNdRN%KSI%|5O>aubOC|Q=4Y!s-3IQ!@c)Lyh2s8u2>6?+r6puoA4V1WCOWhj;GRPm zH59=}cksCqOsjGJItdxwJ36FdnkEl>hIc=p6}7ehrh6n7#>)6cdyEnii_MMSQh=g| zFzEfQr>u_CtbPl7@Bg#S7S$lwk&`0!7il0+xFz?TQa?8ug)yEFL4ZK?;&;yl0M5o- zQoMMermm*P7>wzrIR&^~CCNZD`gL8#A{vzSS~xb51?Uj#;xgP@E3z8(KCS7A&9D>( z%i|g1_@RfhZ!H z!IF43@ewnO?1hCt1KdY&_@IgvEKh+{!Cc-mzW;--LM`P&r^1>Y>_M0x^7{XOG#VZ7 zdC(mJrUY`B7y8c<QKul#<7*99W$FpU{LwlrQWvM>oC-*KA||8APtIO`>A 
z#^<2?+a)Gc0tz9m%gV36G<*yxYk)_Bk(-c;KL)q`77JeN*uZ7v;MYKc!=Q5X_36ND zi=`j;WVHdIkcw#;NouVeduIZ0 z3Q>jyC}?ArPa9F&2Y9m_z==HtX=u7~e<7K9203WcXruOz@Mo&_c?|9?KFA0(%CO>= zBKvEgNDFuioeV!KcB@HCn`!vG<&MPZy56J7-7D8uFomF4v<0ex5k>UHkl+_?3nxUt z!B=G1k-VD@k)dt1vrcFnjgs2%l^Qbe4b$jK)=BTklksOnHT;~NpAVoF3ep-=eo>IW z_G)*wiM8ukIlkO%LMowN8t3I+k(Bmz#)Q&CPRO-?7U)0EBN(`?MXWEXjcMa_w_0?> z{$OMMWKh%7LO>&n8_)Tu2CmmwP+d|}Gj8YRqu)6FbNpG^-pIJs_bJ&Tom#yehIPH}G_!GiUP{-ckC4qfK@2&@Hz@iVBgg8k(Umt^+F{s#o{sW0! z#UL~^?DJ6gny#J#F#=IFl0;Xyb!C{quuyM?f_sK`yGEZkFKo;JvQnL-;&xL3Dl&|Ol=iGD7z2|)HXL-NhcH?mYz1~7AC?tCfMcJP7 zWps4jdtfU@M{%yg~4l}qQ}i+MGxH*$Dc6Pe+}P-#~DsU*Ke@) zjF0Y|p<>@#gMT_F+d4Qn5I{AAwhEaPux0+mAQo*krdv1G^o$@^s+Z3v)32I<1)>Y= zevRv5pYFYUk7oHM-5H*xzrXgUz!Y(KqLEjn2$&t^7c;cNJz!X7{(=EZt;G#XXR}Oh zK(vheTDEyyY>FhM{0CM?$9W|HU+&x8YCE+y zkJ6(bgjQ;PqsszRdPM9(pYd`Ka1Ks~HYr5Rgy+5~I)~v>spRx3K4NxNC?e6piL6K^ zf|#JvGfBqMo|SyTKbzb_bFq0bl+*UYh8gc*5U3RRJkPuTZTd0O01(ijtZ*Rf>jD`yA5i`M5$==Pm!_J4*Eih7+dkpt99YKG~F=o z{*GKhMr~F%Y)?l}`OYAReCUS&WWV6J(ysh&LEi0a){cVBeN_RO zAG;ZtA>zKJ`X(Aojg~sFX#RU< z#oRAEazmGk_SxZwhpWk1#x7tB#Y5>1?+0H#rW`DgIqS%$Pi8%K)BW| zfR=FUJb1XuQ-4w~NEmbgo280XyXr+VEt4`^E0T!<1ifSkFMDS1KH-}FIYrG2;l_`z z{HLXHF0d#Y7Rgh$Ftpxp?tZg|y`OU_r4#8mbH!)a>vPk{*jGp4{7&8=tT{M?ZuFU4Vj@2K;CJG!T1 zzokE(i_`oAYW;|U5E<`RW~mjMU!@3NY^kf2DeV1$Y>yrLSLQBL-gLEfS11pMMjvPg zJ9YN92_c=Sosop@9)}+le;99lVqS>e5L&$7DHO(h{_(`GNN;qMX- z$(OdMUPuvLylCrW&f5N(#@YvxgJ(MQ(;%tu-xcA7M zl-OO@GTLHM!0mVNaB#7H(n@tAapE$+hgts)p(i0y0RvS#gF(;ubKqury-60G*VZIk z_t=k(hKz_(oTf(oxK~L20SHhryl|c_zb>VgYQ|kPbG$qoYG+jeM8(=aOUozpO$RvY zhb32^`(!>fqRTuu6yDJKuHZ{T_NzMJty}qBNMMwIPkjEUF0cFur9e zIH9Yz>k&9j@*`aBH@jybA)@~Ip+wPrtvUYjtH0cXvIe_iM>jlsCbV^KnX!0ShHmrp z3x`J~6}7%zPS_0D(1{5ggvWvlleu{M#1%&h_oGu8H;s4BnTrgQz_>I^Qj49GH0h{b zxuJVlKXvU+P)AT-(5JfuufQ)OrDzLpanSavY_=vVvlCNv%MLPwvJMOJf&$h>foha9gTBU*2WA zl@Hj>x)AagPBPMW3#z7O+$u1!v;(Cn*a;x#vFM~rHEO%tSUB8EBJodf!ZlHuhd03W z2N&M%^IPd$Zh)`>pgt-M2-Tf33{r;^wu~A5C;1W?ZZ`IqBofHcXcm66vbU 
zHrNb2)wl*vfS~Mm4TQ6U96#x|fcd}D=(l469>Y?6C)cbGWsc{&x7y^In;&E~CtA1O z$Yyo8JG?t59hfXe;+5SIOOO9}<;{i2F@c4QY*95T2nZ6uxXg3mJ)Q$IaTGrL8tkG; z-zc-ld4DxuH4+vostS9QLIRg{Up-8fmovK;+~1aX;#)8YrDAGDf8G0AQb0(;PgZv< z%LDW)w@7$fwF<`VZMIT*rbp`FD*?=hUuM1*wxbvRctR7qZ8^Z?U_(I{`by z2Vn*eGPx-332iugt!9WA09r-pR?$iGlpp#7N@qgD;SD_`OXxXimpZSZ$?7q@Cayv|kif=W}f80{R8Y0#M5WLj#f2S4FKe8oTwY>mnO zTJO&kEnr?4x&tlOMlNto7?4tiSAPFO&@Mn(e~JALZ7rMRKf`8qE~y}Q-AFg?rdYmo z7~Z7-IAT5=a~;|eXlibVIZ%d}9^iUj_w#?1!WTu7s}|ZY@BwjMu7Eps+|z|P3u$3; ztWFX3mEsoG{3W?9ZsgIpjytv4pmeXTn=>UXnSUA!HcD7DtC4*g`p=H?3STd>KNFLG z)E+qyk(DAWNbhkrKUG@yqRm{_+M`Syv=5%&Nb59AkoF0r0W^HE*0F$0qYpBl$y zHlNtDUb%rtpo{i}pA|SBP-#fWxNiscUEoBadr|fca1wX^hVX^zAkrY|5NA!YBTlY) z<;^0QLCTNAg)@e70F&=%DaXY`WHc%8eRnwzULFd?uDs` z?!9Jo*0`j;b{|yi0Qn;SWdDh}JY zr0KoNUm!UOvlzm8WJn^r9FNMvT-O&q_282xWoWv-+WJR#{$iL-Oa7mvEbvk=5li)QV*)6T@IVpD8trbg-Q z^&WF}2EFjr5T#8`=#VT->uge|@K(&=Rdd9P=S__XsN2b1m?#h)a7em|$^eD=Vg@Gh z%UnsHRDKwwVRf{!iVL;e97JT%j?3Ih3%-(t+;qhHy|XctL_aP#DRExLoX=6;rrOzL z)r=-UVh4s>kKX{CCzH`RC01k2x+pjYJNn05vRP#CX@GDKxVM_=F^*j_pun97I{}nwC zHoBE}x1E>Hr)b`FYeQOPc7bE6q_JL(+!m#Sx~I-)#_;jOkBAqh1|Ru|uK9B0$IMB| zwkDQBbplT>0=ontoKr>XduFKXhixQuND8I~iWOmTr)+pnc~pEPtTuzo5gIkvoA)J*9CNR}9fMRRN zo>sS9mE6^4pdNv&LMo{#C{#J-UL`-42#2f5mTWLB$(vVs-Y=4RTjZYTT=KFCH_+l!>>ED%@7Eu4Qu3`1} z+{y6mmD_i_iXA~MsTfeb()w4_#uN3;Y_!>cqF|5L5bBwTR%J}7yl#sW=zZbtDAine zYqt?nwXh{5^Kf<{${0tfeqkH01lM;Cl#X(doEKqokzB~0<$dLsUk$V$5QWq`nWhNz zM(SX9uGZQt*o`o&EaP@W;z(+*8<`0%g8Q-*DEN_os7T ztc%t~YiSw2e+kk<`X;f2!s^(2Y`T|{t2H&wy-e$VTFuRz_u=K-q16`UpLBf_lc&zt z>h^!_j2Sd{v>DIo3#h&8k>iWlc>{toper6(AwitQ^G)&nTa3a?GK_|FR}1GIP-8^j zT@yx(8ZxjhyP`G?SlgiPNcrB>DQ3Q?=S8oD%}sW4Gw4> zG`IyAr`6uSAN%9I=cRHkz5?LWjHc}H#$uYx1oh03#@FBVi860U>}=#a(rK=HTQ6>@ zZ!h-32bxfuu9^V{)w=`JsY=obl#_8%MS%Bawo=Zkx4MRFcfCRoX&GrZ{iz8@pn9HN z&unuK96UnBEv6pQ zbcy2a7obmC(JdBXxPhwVsy_9m`W=ZAMy2W`@CB^jT@r3D%*oAHraU<83;Xs@L~|In z)hGR)f9;#k3OuhY`fGQp>QQWpnWXu*TZT(#Q@)vH+iB>`4aI{Kd8}7??mixq-Px-r z)xPUBD!%o#ph=Zfv)%uNykWCeWMudUH)- 
z=!x40OJa{PbdEWXUR!ue8<`H#BVBVDa7emy#T#*i9cgsgblu%pf=0RK6z2eG5sj}# z9y8bv7-@x`Go>!?p7CJG-*R6mi17ja!(H=Ol3bQ~tcY86k38*cR)C)fqj@n{43MX> zLhCqZ>Dw)7ND)V`RA}U0l7`=Dn@zl{{{f*hvQBhS3~ga*ZrA#y$49x&d!KTs-+31| z$7^0*6|nlJ8vy$0153EK=#2VmKsld<(H>zs`BAph515wxx5gOMqXOinv}u2(ZDZo2 z%Fs^mln$q?sxbF{NORbaZaxIJMs{7eWEGrK(`9F;)pTdGgm>hL@|ko(>O-|c$%em% z6;t+fkRKh+@=x?{f8KY|+MaQ%T~xc_b*1eUflgl}QvnGyNKlCyAb>bLti!M9h8q_TEy3X(15zaC--TJ6$)Z0<3SCDXI1QwLL>` zwe2tA$8guL0m~%u@qgXTfN*3 zUN~$vhQbdW5{H3mxOx={64nSY@C(Y5B(#J&av(U$OINiHf<3x0@-$;4o@*gEM;Gw| za?30b6ozH8V;JYP3u!6#mvY}S|CdQhU}*-ysG^JR58G}B{mBCseBD07FrZ3$Wa$=Sca3x4AT6$zre(ObY2F(%|*-_&G43 z{sV`hpSoaVi`~H_WXxG4J23w^>+UdboX$-=`5#ZPe~=&5LW51c6ri@NzO}s9r$DkM zEg8oj$JCj1;&UQ&n?>{JiUjdD2r*N{_t9AiQkZ>&XRf?7m|Zr6RdN?ZlY^nxBw@sM z)%;6u*uzeJG(prhB%(T03oo6cCqR?%DhKA~h4WgDf+=no0}Z+VeVG4mDijQpHxOd~ z7>W!@!;Io|%Ffm{*&v#Qd$w_IoM5WteiUGcDkhpSpxRW=JF3Ql6_PALS{MYn^AWrF z*n_CiAAAeCQ;qcu8ql)23gb@Vtjlc^f>rj~Di4(~5XaEe3gZKFxWiE1LsKKf7qc2r zSGcuqVnf?rXhraf4U$b;%oZ7k`0f2p9~p5Gb`A*nf=r+EGom~;OZ@}tX^ob*!QW`w#uwi$R7O0i2D>H)1l?q7&ddde*va?r^)~T literal 0 HcmV?d00001 diff --git a/competitions/getting-started/titanic/image/titanic_output_30_1.png b/competitions/getting-started/titanic/image/titanic_output_30_1.png new file mode 100644 index 0000000000000000000000000000000000000000..47297eba6ed175685cb3a3fe7c121de330bde175 GIT binary patch literal 14028 zcmeI3cUTl>*T!deSyrh@R6vko93U!)NSA72VWg=z^s*|_n;>mzyTM?L1s#-LjH2LB zrFSMavVjEz>1`7c1q7Dfmv2y0-YakW{(reHc3~K%J*5isi{?Jg7lY3Dod>+mT=ezyx#;P3RwnSgpTC=rx6FZk%KP>!$hcm-=zCB}>CYYZ z`S`ggJ@6X50{}7rbokKe;FR(H@NzI9lb&ujP4}-fO>divu(!0&M3-&hpdgmL|3Vp>eF`>2GQ7K~uL$^ig$qFj zIz_Pt43SrdL;wT~xJjmikfhT;Y$hF_wn98sq8E(Mb(4@3jfQbY#EB^opE@_6JXd{y zsRrd4P1k@vNf|6tXrFcS^tv)r5Xv+2JBK`VmM||GkcITLG6MQ<4QUnG9YewnFn<}Y znb5k{r5ZL|5~6!Op!zFj5VESntcw1;GfPoC0O0?*8Fi02>8Z(#zTY)*Nlcw+N}SI_ zVJMC8UiJE$AYg$1y)`T>*G;t2x&B5JZck!5WTS7e=|i1r6d`h;TXZF9>o_lZN_g6qf-E8L{ac>2!6CtOcNFzJncn?l)_ zOp1KY!Wy)X?9iqW83i^VDJ{m*eV!|$priX2H^IwbMTIq2IGSUT4&_r<^#OQp2CY0V 
z-r&QpBe0rb&6S4(Lbr^)9{d*3k!MP>79X=hm&G~PK^Jf-HDIr?pF3iizasZ4Og#wf zM)O&}fXya_n#^t}&og>0xw8zKzCJ9`@|NDXx{KC#1{<_&4r1gYo*I|t+;1dwlE#c3 zEZbwUM~|~bi|jkY&JgyOo6HR5J=wM5P?e3&lepV8w`mf)L>C)s)k8^$}HP+S@T@^ zf{SNcOdja_F1Wxo%FL^bD0&cKT}m?$J1=eIw~6C1`^D-rFcfa5JhYm9GAu*p)G^E+ zcAA82;+Vm6MZ-1u{=A$BHb3SO@|%jzrmo!l@i-`L^Qs){P!-BmX0DB0S^t?ioL}!W zs2=w2#j2A-a%bv!!u}olKHR)<2}g~Z36-=(X#?5rM29D9HgJFX&}nLHt-5|{L1P(X z&~$nb9sQiP{?g70J~pRuys#eZ4gCSEntm1a@*w4OJ86rHMY|v-2yqGJsRZGztU|IK z!S2@3*EHrTfhuS=d(`W|naAn#itC0n(J$91v_Y|2O{Q}#Y1I#TKHJou+R``(bmgNF z9fRdsz5>(Msf70jnMUbZXkGe#pKYAg+7U^5Rc250u;dI4ko;xr@$236ilQJSHF=_B z7sH65I3oKb3Q9AY{Oxh3KVMDWV{ejr0~#c4`H(Y3eJ5jcb5QR` zsBVXf&WOw{r>h`kmeoUA{OPWb9#iQJ)AH$pQY59oQYGTq*6f3l!K9jfE&OsJX5=vP zB9E^*yp}%vzz6Udzv|Mjkr7F};XN$LrxRLu-J<>0*iwp|wqXkY>Pgx>St~MH)?cvIlbZpH0>K-zKHM)W#&WufNeCat|_< z2r)o5BS6%AP0iOC=_C!NoYfr!xKf&BsFb}~vI>2#{Ii8Xb{mwJxiHC1Pa2QBIq)_} zjibnJa;wgqTROL=nM$^g-c8JH8ZC3?hGZ|TZKrIN-gkIB|NIWB$! z)FxzFv=i=MKipc7(D|Ba{wYdFj}vJ6iHNFRUwA^aCwKH;mZ}1 z%&4fR@1sM0N(uB8>)C&f{~l02S-Wyn^4pA^2QnYLwEb)&){F1s|M1U510+MJ5V#3G zin37zkW~oT26-D#TDK*u3OAMOFmp$AuZFxcZ*Sht|83&_t zcT0$O(FnuI3payWwGDJO*^zZlz#d7?+4S+=9}czpCU&+iy3TCHiWpDnsSHP=Iz=co zV#GO+*hl^F~imidZ&F_<)ia>%8Pc6kxV8_d1Endyc=VG8> ze2C)pE9&`#@|1xihb&1Ns=Ws{2W$9-A$Kr!Y`52Q8MpFA54W%kcMIVyNn4EK6dt{4 zh>tla;I_l*6@jj$DERW4e~&id_`gf7|03i5r_Hb22{)axK^NHyEv5-HZ&Vn7@W=M` ztcC|BD;tXMo3RlW_F4c|6@4S%noYsGeYbF=*!R#La(Px_T(NILXRDZNE?`(F%Spy| zNv>ROaN&2^qZz2pDcDFv$)WG;#z4OKIFKG>1Gt~R?~YTaa|Z)kp!+k8LP<%$4sfMS6Y5-;TG zUxoCX!|&*)|2!Bi24n-8L;TPhYxV{-@2#NNg#uimE@4eHF+yzd;@;08B%?!+ zw(D*<>fQ`?0;k*8SoRnQwGvfbg)t?>bJJD>CTqj2tFS?GFm`0JjNm-kx?T;xdI7dkZa>@FFj z*k1n{0a=d@DG^sYj5S(T!+fj{-eSk2#ewKX>y8@T+`|A7l&3+xHdKOM=7?4tfWRhUcu{aDpb4{+!FZsFGIh z!B)peaLL0&2ePMra5-|c5#M{`%Ink?xOM8x`q~E|!h-9hxNmXxXwdUjQQkqS6yUpKkXP zIN%hl_z0D~Dyk&4-xyNFBcbwiF&V2&UO(L~jE>$yX<4YA62@hII!G2Swnp&@d!1d8zoKk(ra8Q_d)kvOpW*lPwN6DX%`@*m~PTe;l4(+ ztCq~@D5_|C-4ve^zq#?#(H`GC;>Os!uq6oiq?kJM5A``7Em}_35pRMcZs8LWoi`3% 
zdOY(mYQZ&JV*cM1_FO&#|E9gsmp%67iNvW!yeDkLegy!8M(=j;(ZOajzY9n*l4Cf1 z!d01v0!afTyVGzLAgb1ei16YOXiM%kdxy?tdJa5>i_x2HU3@hTsi?S@HHiZU0~hY zPAzZZh(?6Kz6%tPAVDRMy;;V{hH@R%d?n;x z8CsdvwyT1dYP{;gZb2)_l*(lMCJO=?T+dc9>M@d&kx6!!?_FvQwy%v0ns|jjZ}3qO5UkDQPJ1xSu}~8{3yK zR18vND-@Wycs7al(O4Png#uHZbd$|W2TS%Rw(-Yn>GH#G_D2!QqXr}&=n<>)63e!6 z!m0jjE2=EVn5xW?;S4|Qzk+EePkAkOgb+EU&Z{J>DrxE}-9Bk5obGdDc}?jRp=+}h zsXD=d5VnmILJKO?7FgU!6}H5+Oae;-Ao7C+JTj2o29RQp^aA z13zKe%T426#P?m-9PMCXlCXTpXNxu}>%44P*5U+YdUrwCo1a#9Kf(@Cq{ovSk4x3{ zj;izo?~?Qjh5Lb*YtLKt1=IElnwg0YX_G!r8opPj%+Ug z5^8|_JXF5FQ-`UCHLp15jwd@bY)M~xg#8K!*dymg;EXkOKq?E>jg|<^G%0b^J3MKy zz5)dj_a!<>#!duXC;QWWRnBz5JOJlOBHPRD_!6j;JUrdZGXWKT?mPQTG<$R>dmhYp z>qrNcQqMQiXwAfku4~d&Y3fD&lSy@rBU9&0&7Kgc^LmH#wMUjg4fVFRDN*bV_8rC= zbMsgAwL`|}+ZI#iO2c<7W`prBt8&MkoT~=aA`{zQ*Qcai3L@UdW8_~VrOe-o=MMq& zzuGKz0#CAby&Ban;tf(ahpMf;*AdXjx%_=j*Zh zQnIX;B2$nx9WU2?w)BG97`FkW2%gb>x||be>K8I&oqEgZb$6cmQx{xvw%Ik?#^xBG zg3?ra=dSnMJ4q>^ujZihV$Y*vb0(+m_>X^s1!5C8y8xTX5gy_Vb)?0UaE+J@GY?bZ zwbx|cLv2*LJ>WU6+cRZRJziY8XUZ=r`0n5&eqNuEc_5!PHc~F2DO~RK?1I2|xCNh_I>E(ot?{wnMkMj71ROh8AE!2!dn7DbYrrvY2w03&~_`%wsYzS65^FO!lq zWt7ysV2gz&H=pl!DYHhZl6>}O%4vPc{1Iw;Z`(U-_K0<;%Hps_W<3O(@M3YxX#hP8sM2$2LQO+kGo?(a|z=GztsQnHnEjp!1{X zdhd(xuv$pn=V1&_Q^M zfp}AO-da-irt%up+pBazpYdQzR32(e*BayIptvptG{rPGl^_c_Xx$k$A#-w|30`3@q6`_!Uzs)^l<R#XTux`r0%QJa; z*(>zi>1P$JKN*LCB=dL1L9>kZG01PN+1iCTe4&#t@STQP#dP4e=Jb3#@)d4c+*c5Z zI7Zhur>^n&subYiUcEt5t^1cG|A5dOmtgAGy^$WJ>r7S4V)^cU%rV|I0C}H2)XiFa zF}r6$8qg<}niQHKE53hwUAjCoGr*Fx z4y;PEt=Ly_^bjwgS`6sFbC*SB<1zwSy*;=;qPI4!An@ zau8Nv_QvMi>&y-cM`ABO9yke2HvFKfGh#mydJIlI_}d9k@Jfj8r^>Q=(+RI~9S$AaAlQ&ECFzh4dUq#;d9aOAlD)0A&MIj-tzC&UyA5%)+E|CnkjT z$12*v@^CwX+7t#){U-ACwyK~?lZU@el4FkoFu^l#QfQ3;6yhLhAMXv_?oVXH)a}33$ztU1ypQE>RGa+8y7%4*fQidLa zE3(}OncqR)UR=E@)~CmRxfB=QBARum15fqV^3B1*c(pN3w0mW2WTT1G7lC360POx^ z(S*H~j?#)zDL{x6r^bpSTrT^a6})s7#`x*&m?&ZwJ2V6)lbUnUP#~$Uw^Zz-(olZs zOeomYwPn#5t`S(=-S$qz_%D3v8uHNP6P8FOV|%DDY(HR}eq<7xx)1geR3jsG{y|fE 
zD;q;?`(J+?%MHFYL7q5X=%F%Zzz)SKFlAQtIl6z(?{v%R)%5ZpZ%UlgIkq}maQ25A zDzQ5#eVeNvzWH?f3=3TSRfZ?ME9{|hynvr#rvkFj=zmdTh-U$Y&0|P)UE+W; zb6`AL4#MXd&L{>6D}-ZeZNj8lBe7Z9#0$+$In(*A(#b1Hgw)`!r~D` zkU=yo6tn96$StBofOMW(qG$5ZQ|tkD%;xHGT%x+)HOG~n4U+v%_x-MoYR)|S)&?Un zw#LT>D6d5WDc;`tAbz6K-DZT-?KbR6P`l}n4g$If2`ulWkJGn7c|b&^w=-OW-oej( zDw|gk(9>)aRIQV;5?U=l9TJ^f*cYiDY&?42c!uX35ugt952 zyX)Wn6hn}=Xy1peVHeu?ty{mx_Yo7DwZK;7Dh@Es5S^r<$DpuaL7xydNtD-UBRCLJ z+EuN2cjy(h$gjSCi1Wv$v zV?C|l7=okW^mwL6`X^EWQgV`YyV1^~)l<~n*bVJv5|7&}JbXn|jiTA5_QkF*nb?HR z5rVUJzki# z^rC>|yWqrc*$!b4fMdtRS^Af5ruePul)ApW7n{CN{dR5ix93+U2dPnlQq7M`JUX4{ zM%%0ULnmwGNipj*pB8f<->0Yc&OkGeE?m7lB9evb=cAISyF9xD)5somWuIk#|FByb z(;~f)al3?H_dstrPELjA*53*j{KPBq3O~GbMyCp=Jaa>NpM(3YVSQI;-6skWUv%M& zl!k1fd(h3DD{AXEh=h;_>|Q;>*2U5Da9xL=O>42|{we+STG1aiJ1%D)By(V@869-7 zY;TJgb4*7odCD`k*9mro%8lJp)yFBy6OH!WbS{6GLyNNvK4~V5Vc-?*g)G#%S{OLq zQxdy&;IcEfhZ|zP4Nx+fK7m|j#H+p*mJJEM`2m+``M(}*&rVl+yva3!%u!&!;4Ybk z3L3y(U0>7j!~NWmV?FZ%e>e+6_)AdD0^yyHKIb~l##>k0`-WWDnX^JPX&jQP9{pPY z`xkNTZ=179aG5?6Yqt^R_9E5y6YKsUwzi*;rL@;PkVD`WqY>M>M1zfxe)ltdH4eb(;a*oSux1ZV6XP<=@eEO&ZTVzqbJD?}>EQ z=yPc7Xs}a#hRe@6$!-Me2|>hsw?8oQMwr}3E=-I6lceDA8E`w5zVf2PT%#5H1T|ZU zL%H<-F7o|t0Dk^Uz3{hD6o&{mhi-yppmpMq?)~pSwjQ1wcuO|G!=D-HGO>zqOt#xA z@AXYpo#B#V&b50Kst@%%+;0KL5Hm@augeHQPjSM`-}n_DeVkFX{LoTu22{iwCB>M6 z_)qF8%xc>4`%b$HiWBj^imlD{o{*}?03af{;RSG?(5007ygFq3hILfNqrz%5?%F}2-1HS! 
zn}?0~kjyYvMmFQf#8^Map9tF<}p7(EMc)3`wc`vnHQ=iAF-@t!74J>9d;2 zkwOnrdo4<+QhmJw4DOg)am*qWY$@OkR<0o~?qSc&PB;aL;i1cjda{ix!+!6)3b3o4 z!&=w$Mi<`h&yJKBo+a*R97z;TBPP^DF)Zg&dwY4h9&fKJ|JcKp;gmRY1u+~6_s6Cf zOiK5O%tjnrD+g@>B}(mWortn_m*T0T2Z!d-N`efN4{zY0phk;!f)2k`oRcW6tf=1= z3CQuO8+;YW!C7<-C!tM+O4uJtaIlBXD)U2-aC-Uz%cfVOCQe>&#hmdpd$ba)Z_PGb z@dEHhTd07>d(&k1y0GsD7~F-}_dIq7mg2?-0~F2YGuw z%zUjxZ(#5!AETwad6MTVTp4te z{tIu)e`crq?}9P1crQ4(Sv5{qh8a@S$CZWR8hyL1ed?^6UqD|Kv9xEG>@dUUuPQdRk2HZ>*XhLwahL=HA~KiXJXn9;&Rr<{q&>lQptE=j~Hy z;`nrIcb^pd_ z41gRCq~eWKynry7w}ERe92`p`r4!XZHwMm3ZGn z^L#qiGkjDby*oo)&eiKhi$$R2Y)S8KW@UgK;shbrDA4ie|nJ1(nS!K-9ji}oONp>e#9B9uN(YuEGdNpVO=m+=gxaxH*G5dorl zRm~>$=@d?uRH5$r>~w}XV;4Pq1b#90UT}3c=Pdg>om<#sE-H%|(Fc=G=<1(>>m1-r zjkc@wHu|Jx`wF5%36}Of0grE1pxiJk+|1FZ50WBC!2?b{UorEx44AJMv|k#o6X?TT zLnaAzI}%Jq`!gBl2%eB@pa?hYO-Hy$B-t}%h|F5u#ZZwi*1ak(ErT-`!|80+BA2C6 zS5Lp6yq=?PI4LtV*|`vbtj^Q)vVF{2!$OyZeqb%mlWHwlg7}T4nVQB(nh^vhMk@K6 zy|@E_=gE&$HA876FAMDjkg(4e18D16>-(9fXe(#6Y3nEQed4svJqgsKq*%k!-VYbN zdK}{hM&4X!Oae+94Q<=x@ijA={Y$e`a2XI4$l%SIYm~Nv?OVX*o!F0<8Wbo^eaUV9 z&6q9Bh-o$Aa!DF+$w8(UH2u$SHKJTc;4hb~EEcv$L5^^{Iy@>3%AlJ19DP(*vAjik!! 
zd{O?1;S5#GqHi~RDp#4cx1Bz(x!N#qHr>#oof$u5`A+udCdi>nA?TIU-fYws_ngfM z2phYum{YfNuGIQ_M)K%xF`r}7Z{q=D>*AeV;a09#oC$|Vz=EqqJ8AvF(hq`aIqC1<0*_cbQV!sK zj-HW?nyE5pg|U}-@)C{Jx$ibzi!MjBk(XkiiTgoR@eCvHJ&{7Ik(eT+PPQPesEJkx zx+}5Gp^lcl24;EaUILN=n?-TY}btyP~*t&`P_3V#>~UBf&( zwf7_rM8we$9brxB#VYim8`5b;HqAwihBZ|o6{=}aj4qIL3$2V7a@5&%T>@{%RP!2m zE$#8G&x*RKsyaNIFtru0y(OK5H(rZn_~_pVX=S`yd)W5VPiF$3R#Q#2H3hs99+OW3 zeL;iac(c3pGhF<ys;17G9);rMtkay9aFNo> zXWpvXD3{N|rJ@^deK~}U!zGny1R-pDuvP)w7O?3L@duY=pm7Qg)#cCk%bGm>(`up3 z2#@uQ1F;R`#74VXeZx5M2kIDB1LE7NzIx}*j&`<{Crt9<(ejHfUd;!8z^c=%(KBEB zpJgJlC1Fd`Cok{39(C;?us`#`PQ2F_DsQmiH2QbN^#8B){>$EXan=Fa%4+Rpvhe0z zp(BzTYnL$8F+5q+d+lRM9_(60Vf!#7<=442Se&%bh`{plRaAzHxgZ&0eR04-I1n zb{<%zUbNvgO2MU4qc-j}`s2wTQOsZW!rbu2s5fcAApf1F3iAS%ih0=F_;~wbt|dSg z>cNnWt-eV86VYux@vjlx$H&XN2WXwC1SfqS1{oIyRsFc(D-q}+ae}Z2>?CXUDVAw_ z#SjNM-YpV@adS9Su7Vc_S3VO&92kk*#qBQQS)utXn_R`bylC9x0KR_^G_Ws+c>qj)|ie3W8^{< znBR~T%1E3ljLEwAf$Eyz(6CwIBILh;cIxjg5Ei%A^a6u?o_y7h6E*JC5xCdny~T?K zIRS~e}q7-cuP#d zv4%Ex_sntkD;u_v4Np$l@W6Vf+>)lRaKX~nhWEnNdzh0zGGN>K!Lf3Lr4jwRcMZ2c zTNOGnGbmdj|6lOS`}wZ!e@k#wzPD}NfnVj+zcRTv|JIL9K^Cprk-@(RbnbG)i`oZd zSJWJEB+v60`US$e<7M2rznE7FrJ$K8Ek(sJ^Hk!yI_xdkix-E8!%e5j2DvVF2&hQK z{}Alv|1=#)p6_3kH-OvYa*yK5(PpX`@8f%ES~51Q=PDl&%VNO?&*9%R`CkIVza^dj zBP;(uvhsJ2%B$fcbpMHs_Q~kcq&N-9>iVp2PShuMK0R;hk@|QKZV2*k{?co4_}+<` z-xKq2%AiPe!#;aR<;;gH{g0m)_)nTl6~STHPRJW7JirV+`n%ufd=no2C5Y*~xPPB? 
z<9K1OUHwkCV4U&)y8VbSHh}$tR^rB3vMEJK;?NExy1qJ2B;n*%pqCUhXOize4B2eB zF&W#Rw*h~{E&oF-|3fVQLoEM)#NtY2>9$&e6?)7P*7cL1;)&slWg+qgf*@`J*vrt( z9(@a?kvw&ND*#zHwG=}E)vR&)3Of20?*=)-{i#m*;Glb&d*C5t98qvj%L)Jtkh(buIlNsH>ZLB0UBNgXm|>z{6qG@f1uTnIjyL85u# z=;+Irhn3vE^tVYPwiuwJ^3Xgng0$#Gqg>rW?`QBQ_qzCO>T4@@yfq~KPFK-OUwCWl z>RD5~O-#y0>)+`f8AFDZlYIylr`sdOtcRXsB+UP<+}BXs8Pg!VJ?bh01$w1DsA-a# zO^dB5uSobFlu+xx1TK;AHmL@~Os?BK?_69l&SG$}LUOC*Th}hwR={huS%Dk~=s}Yg zlCRS12(@2N+Wp3W11ym;63%&)?|*OKYM$8kF_nF6cmIcvTKy3^XCny#Kb3sGIr>pQ R{>d`{I%0Y_?`!I}{|Di`ngIX+ literal 0 HcmV?d00001 diff --git a/competitions/getting-started/titanic/image/titanic_output_44_1.png b/competitions/getting-started/titanic/image/titanic_output_44_1.png new file mode 100644 index 0000000000000000000000000000000000000000..dd3bab3f3f98eb9a114bbbc488ce8fdcc2e1e347 GIT binary patch literal 11367 zcmchd3piAJ-}l$dFjG0zwu_vrgd@Lo=IK`0DOzquCsHG4>JEayO zaz3q&noWd6F-+S;Im`@$!8p82?cKef=f3aveV*sN-q&?8%=*t-%UbLI`+q;*@9)yj z)_bLvC@cX0Knk+hbr1lE_^*Vwi^%xPe)FrY_={N34(RYA{3~LS=SBSQ#r_r!K>#4} z!@@VAg%ve}ziE7WxBclue#cMKJOYmar#w#k`}&>s^**8&ax5^&+mEWYQAbZ_gO=Kf z)2IEn=<5DsfR10Fm+swDJvRVA4S;rSKOCAi%nC0!b}Wo1;@K!vdHkZG9de?Puw3`m zfw%8YyqXBza5DVYA$Rf3mwu+4OK;KcKBAUrte0}xR*O-_da(58i;An4{B+na-k0+O zWS|}EEas$h;Os*+*5&q7Atyw}(GAUoC+vhNJ`cm9YctRTKCyarN`kQPT(2?xbmP9{ z2kSuSpp(i*E&xI;pB(@TI&UjgjA&gqA! 
z7J)&Cje6K~>bu!ikSBjEqVi%d6RNq+i5}KT2m8UWC#Lm6n?f^YEL4-oDcUI&I@h6+ z!Z{1l!1q@xKQAQ~QupP~6J#K^uJu`p0PpSlJc6K$w@V^5`Ore-vS< z3>lq~sc5pZD{PaG?qnPyuW*>R7USlinm{%|Req=B$@nR=^oUvsBZ5)1Wd7DREnrW- z0(&t4xu$aN<(-_HY+SD~bq;I~iFbS4yMHw^r?MzUF(Kl%)5KfQjoWBS-52;O@N6y` zo-^Rco>QkDF)RpcoS=Rf2}jI_UIrgRgs1D%S+C8tm}?Tc5Kh6TD+Hq@Mx~Ux=IJL+)($E%*={DbTuD1AwlgwbzkiH3u5(;=Naw01aM zWOl~uu_}2^EEIwxO_*k^3O}InJ}N+l#E0~n2HgNt@AUSmpW#_nU|nfKJRk>#$@0V4 zN8#~Xpe|XOkWLaSGhhmG1|%RI@otK#(xe|8EHFJ!70!`pmfL*t3+~d)6~i%`(xoYp7hCW5)aUN#8ldTFHbEr((YqzS!y27mX+E~ z_(Njs!^L*u{GNPe?p)UtHU+&SHDwDk zM%@m#^)v{s-x!l#8cviUMO6rv#F(z>kW^aa%WZtAV#TYr_D*gPj&LdyUlHBM@&L12 zoX6<2D{@1Sw%l|{nF%u;3&2VVh-nSF2~b3;elXg=yuEF zxvX&JgJtZ6DYKb~_8%O8EA}u8;WRO#Ek=XRXcx+cI1aNgGTV)jKe|nQCW*D%{5H+K z{8~3$TARd)Vym1PE>^c3wRc~`Uyn_Iegx@RR-k|5(|U>EA<2c z4PO!e-wqyk0l@cbnB&kpnt-B-*PjVc4t(zu>akBjUEq=u8S@25lk`o|pC_;%qg%wv zWQ+o?G9ZQZ__lz^hYvwqcPN>uHHf^9iAV{phYY?H!32uMwV%mfj$Mk1h_dJTkI+gf397#s@c$80d zpL&*-bZ?uvGJOr*PQ6@qQmopR{RK0FY)=p75^D5&5^@F_P)8@&$L^t{A^oblRCD2| zazmK6qn15BXCA=C_fTnKoOoU?+Dp%&AFXJMqv))3ypHU78jY>g z`9n{B0lkSgc`a8Q9`V=7+(&a5Ih8%N1XC~mDkw#)uhO8iN;dOvrT+iJt@)omA&=b& zS2-AHr~dvrClS=7)!S*5n=*%4O_3K=^Rhhj`|oZVYbA7$tE$m-qN!ZyQ3PK(=X8*OoDS0Qu_`WWnP%xI$G@DxE~XLl3HKYZQ-DvAv4ysY^@xP;lP|K$OHxNgcqD z8%*2DqZ3Szzj*>F=mxo_xJ1eU&;(hB*nJQ-cYH3e-dCu{e5vhJqNTnEI_lnNS@bZU zsK!hmlNRfT+g`B#;AuFpw$><=(Q5i_+it?eCFr|6$KsPq^(+-O;`@u7jaZ_Fy~pmM z({Iu)=!nPhqRdswJoz6X1;ZfMlg}=jy)!0GMzV7t>gU8+J$@8B0m9coqU+YQrRuTC z#=3^?Pl%c5+_G0XDp4QZpItMc_Eo(l^TdAo*9-agyE`7F3V3hNc8PLw8>}ARZKD(j zB3V%#gm3Hp_hEm#Rh(=<1Ix_M6kd*iT6PS5RShP9l;YlOjQa~{MK%EzJ|#^Dtseba zaWbU!@%oUjLP8mU6zCr5mxSTNS02m%`5t}k9}aYb>X4+`-iVB-gS{>M_g@bk0vKGY zGWqSL$BIhdUWJ4V>*R>PKCvq0MbToY#kYX8AX0_^t?fQu6A7l>P@GjtNN-gsqkWNN z27WMH#!$97g60R*qHR!_K6`u&EH z4M|^0IMUmYt@L<6S6o`$HG%VHySRpTVX~mb*TI^?HpZ0UaVa{O0mKn&HeJNXuj9^Y zwEZ}HDh}*-nz5~hVl9hK8MgyUK!i`@Ods`T)_c!JOMOpcWqLzVZj|?34#>`;PG~|@ z7x&8ACBDO(FP4s1{_(lA^nSqp`wsG?E8a>wCp(8|N*Ot5Fexfb-nvouOpoMrPw=*e z8_dG`fc>9)n(`zK`;!g--amq6T6 
zIyqe+yWt+gUn${5z?he@AHi8d3E{B8a6%n1Nm@e-l6{qybo>=!0>H)E%v5^)=ScAL z9XPliTb*EV053NsT7zZ)Jnq0QB3(Xqp{7SC*}rH*9Db@sag3Ea8sj7Plm0N$i0S>j z6OHYnYQ zbv>F`!&HItYzk{q&5fAnsY-IE<)W<~gCeke1GBwycC^a7Qx)p|FuRqN#^EKH`YH%t zaFYeSUNtsx0i2m#KLUrMVwz83ojC(*u^`6gkVhFr=L7Hs!0xo5mF?-?nl>c^WC!;y zRClcd+Kns6%;ZF#>@Q<4^$kU9m;_vC7Q?~v)e_veV_SqTsv4#)dDPg@AAO2|Ve|BXn7$Y(Rqbi~gS`#!rLGfWxnMXON~60G zY9Lx~YicZ=?zHBu2v z%OuJaCs+9^x0dRL)TODKodhjrH?|vdLI?Kt#(6zLs)%K!XkKb% zXgaJJS(^)_FP;g8w~{8;%689nItZga9l)Mo`Goka_H{k)vv@JAn=fl0Qf~I8ZSK*m zFAI!1xP)Fq)!Xs_8G?A*{9R#}F6}ug(#Gn|VT9K$cgu(aTyJo!*|K9F&SsND;QN5u zi{2f&sf^cf@$J;wtKEbnDHC;mgq~bf8xA)2*W)Yh@2Tlf4vMO35HdF=re$0wx}WsZ zO9B5=DNOGc{UV!#T0cY$;ozm!(evwb28`J2*vZslZ~3~dSSe!qz-es988Xz?nNS!mguvMUh*1Bkr2bpp`a0PCA~Ivj zl!B<%qYuFNj@K2@DIDcHSN!Axl#)32sMAM~Y~GRTAf(DY{gX)3A%pW0bh8|Nf15H} zuV-q&6on3pW8Lep_7BW6o(!OYSYb=ko-fHlc%!+H6)DuPJUsVF&rqW_${YdDIF z-9|%97C@>>a?1SnkB;23+doQy@`hDvPt@Jqh1k)&gGy+vX#ukyoAXkUs%)o;A?ln5q;eBq>Mv+}Nn!`Eau!dH zsJkCIMm^*xDTv%Pz3|tt3xf_xLc^c_JGUR5e~e2Y4Z})0eRM_MV@j zKd!;Vp>DAVl}URzQj;0O*h%jP2s&1yGwAdV#F8N-$WZ2-V3ui%o?uxYUR4r*50 zT1cY%5e~PNfaYQoq!OJ#t(MkS5}!t>+SK@5(gMy8dcnby zUE(h5&)lBB(nyHQLlqO!fGQvp?Y_!sXC#p}gT_^8qfOy$rw|9E9d5f}Qr-7k<08aV z%19#A8;;D64=}T!T{TfPGH0G)PmRqTnu|fFQ^V2=5qzH8ustxF{!89^USg>X65}=N zea9Wflyp^#+IJlM>_Ujx!t{p6tHJgC#_L0W^rs!-r}ESb0r^L0g*r2X zG=G)TKj!GEIrR8LB|{3UCAxiL?3>iBjEUO8!vfb-=TM_{nX}l|u@64=BbsB~y7T$0 z;Gjb!!ni;7A8;2sLj{2{m211}oa}tE7U-B`B2(9A2-!B^6=H zR32-biFbKflBKTSb0LEVrAtL9bjXDsLxhLh4kcAoCq9>5qrDpmq6eV^19}%$d%tQB zzUR~d>R(-%!~uQmRz_{;SfzBo?Az&_Z-)RRr(m=|8fzwtkkXao(86F+jYb zJ&u4cYv7js-9%i=aQ@`Ff9Lf4_si7(4aU8t5w3d1gYB9QJ+D>MJa;0U3C>qN^FRK& z*86$`&3i)o9?5@Wfp_B!h2FX#_+zn5dp z&ldgMa^fcMpyKo}kS1Lv+tK}+N14e)OM(UX<49IY%^KOz^OXX?FmZoB*kvAM}e(DP^mBRHoarQLueQ|SSEG%vg%@2!k_{}&RsTC}L;qdxNA855;cVNs6m-iXqZMU^XdEez828221Q+%v4(wA{EO0ql z7sw^?_hOw$*LL34r$rQ5Bx|u}jivta7f*9cGtoS99ye;9%ujoA0rdVMnb+bgu%`e2 zy*2-|HZ2%jxL-KxLSqHTJ3Kn_!U6LzofL~U@fVbw>U0(6dTkx-R+iAl2c;0!+vXY# 
zK8lYVI%NCMF}P=^%IMV(k0L}9!}tZRzb)(vYI4AGLbQYnCd2V(9yzuTr<=RxX9)PX z3vW?4C(0sUpO1}RJFfvD=Rd77neR96XB^3%7cA`lZ|0=;Cf}Z#$ncPnZJ#3**m5hh zaCA%NpkOM3n|2)P)bC)DW_h)!oVV?1l7TI5QWR6W@iVwX(;-&a5KI-y!nmVRX5=!d z&~P|Xoq2~oh&W590l0?b*#s`}U&7x@T6s1rU}OK?wU1AAmb9~inzni^{HUyw!Meuf zv;cH~%I4xQsrg<^2~ZNR%5)lbW}h^bw>Z>FonzPkgrySV%eflOL+FgvL=(Y%$J~Isqg#3>+KQDYJjw2!N_T?2Y}-EI%Ku~=y8XEpUMj-8C&mS9>G%qhiVo}yQmOdS2$wUV zY~;xWP4Bl2VGpEmND?$3wqm z6oo^`^s%oF6Hb3wiwi1sGTON(>aj6GnKuXjg2@#de7uC`0qL2GZF^In;n*o3nzZh@0BZ_7UqPYhZ7xCr_p-7p$HYN_1GotxD> zR{Q12zd=i@$IuqhNDcXpVT`dk6y0>Dz`K@wv_-p|bk$Ray#}*n>}0fq#;1Q~Kq*@I zQ|Xl0N(hzq-wQ{Q)R>yMCy_6;x?sEg{le1AKX&Z4|K(C+SdW)V{@uQ7%E==RjfMaP z=tqVxtGCvk>AW*TKyvI?0_GDb<$meoB56%lho2 zBumB7aE*N3)%nDb&)-OMXXva4XU1ag<(`7*h8xQUjz(L*Fq+=*HqFhNP_WX-zi~!^ zZDNvt81^E1O;k36ia95TH~x{b=r0l^D1`S7$O=qc3ttoaS zw#{Qqr_3brOP8ErW54G4DKo;$L;-9#u{c!4|9#rjA;zIP<@O)y0vGdFvu)jjOMMOf z2g{BQu9+|#Lq1u!Z>?6W^%b{mD9z9aAV!H#YOuz&gA#3C&#@UM+=p4T+uSp3_12Pa zKr&bMTDbV;%NqjOW`g@3s$L35jlB{hyU(K)ikYIR`$I*3UM$g#y#xxWmTZc+=0tcU z)aZtIDa@Csc%}q(M`URA{yqh*1*R)re9qces98;V5(WnclulMcwzT+sGFRCyB615c z6h=%KHg9|+(xrzIJnLpx-}9*TC2z88J5Qb4afWiy`-5t&ctjCea=2|4dx80Fh!h>6 zHB60c#5vafE5|;lesKc%&(-k^Pu4mY-BOIcJKVN?Y;HFHyR&-FLrDnq>>JIZHcNbo zbQv}nj<2NmYlQ6*3X6|Q9xB#d;oca3ZOSa*WrCo$X40c^_flOFl{6OGYebv5E z0k|0~H)dudrbjjp&UquWD;sNEo(CL8QY*@Ea(nHPOAAKM^*PLCaUUrnzWW0E3XZ00 zp8r0>zDIj@n)uwC0Dbn(PPaA?-OHkEVLaLKz-mc;^0aM{WyH`ku(I?_59SaPG3Q3W zt(mAL{Eyoy9T&bCwP$*bk4fs?-`Xb%9i2VSHYd8tKl8CMl%O}4lSA=;u{{J9#zJN_ zg3HjH8nc+yD>rPE@Od38wepkO=ftgs%%sXCGxxv?TAHuWX=p27xj8;dSlu7e9H$vG_PpRy_pOwV?9hNC;T3L~*gW9tUZ39bj>9_vDz zQm!IhA04Q|y|(ZM!L%G;lhPk-b{LM{+%#P)H6zZKXMc)O;(B8@pqfQ9G7Bl+IjW?2 zF6yNj_7mR*n!{IuEleKZC-#mo?+3Gr4w=UPM%#9)DH=udhadZXoyREGdNg@=+;&Gi z#Nm8nnY;^{<+nw%3T$8YIQA=CygU=FF>ew`4@!%PfN)Y$aQ*yZBTc3zBc}408?pO_ zr&KkGIc(yWi$>A(w%fI3Gi)@&%=oVC(iJ0J@#^JQTmx0btIU|K3^|gGW#InOfjc+_ z`w{1Fx+?PUl#JK^zAv8fOirKdE!jX5M56Y9WP4ME@Lkj@$iD1#h_wG{n3c>CV~gXU zdmgN{&fizNiun-^jzXkZh4n~P9-7zn#ZWGIPr*(&auu^TVP+tXWN(fRm|%Mu@fzhT 
zjU*TG@dFj&AP*EMcbGGGq9;YxBf1k;*b9}xYaVx?M6oW8# zoyKqze=FFfwb&-#YP2WkWb4M4|A+~yubhG`kJreU_f-1Gn@Q#p=fV+1bU@vbJi+ZrHt%>sEG{>|n84^0Q(wdCJIh<~Dzzd98DyMyoZ zaEtnH)-zbDS+2%(eI{|JW{yKpM`;n{VACH1d`R7G48icfz5ln7Ppn71&^&&5O=$^=ySo0FFm z|JGEx^WRiI;TJ9Mi-|XQc=ptKWFZUs(=7GZ@;?SEQ=G+>S|5g(`ZM4Bb{)@kE8oU_ z>Y}jY>80xB=eomx@siO1@f;u`>3GBCU+u!S!7ftHtl2p!w(&U`s0GAX`F-E5Kb?o~ zAz@4fd0Rd0!>3a1vPLJjvFFQnt%gqC{1(0Z_kD$`<^1cTiBFNNPv4TAf6;#O?{99E zEL&`#Und#?wJ-eB3A6)aPlOzGsOaTaCGUt|`apLiXJP%aL=q}Ltxs@H=S*<>Yo5v2 zwG`YG;I6?Jz?amGH=PjDC93q8x3RU@ayWRA|L)DY-qZL15)PQ?02iI!CltjqtT4h? zSMp>(aj(pJ3+T?#M$1$7Zw&!RnwU=T1t#**%G)#yP|^R)24j9 zn5O#sWY6A2ZKuQ^?quoV=E3`ScDN&AbB1zOfDZgY9ib)Ua#h@gVJ`&G}4F zlSm`vt5{z|_B5m=DHJT274wTcSf>x1qmL zU-8evC)~4%&%V*tpAAO~NA>~0Se8-Gpeq-iYk6Gw75k1?kl0@B>anTHWGmdp!-t)? zhrb*@mvJG9{2kaBBPkNGG0H^mk-wNvVQr;Yi)Vokj*0kZ?oUb6Gn#K0p;WYiERPv=Z_3=3Sbi(2ZloM_EUijPY2OgoUHgc+5ebl7ml1L_saO7q zk4($YK~cuy*l{*yEbyW8hN=$2w-%?p=j}=G%|(mrFZ!%*_zSmAu<d|}UlUH$OI_eD6;*J&?#hGOdB;|d)215FcP6SIl zG`X}Ek7q3deGn5QZ6h~y`o_Ob`sQ3FJ)RNh2oQJHQ*4NOS8S~k_0JotyWAms!5l4P+qs=bDhEJvPtba5K$cesBa&swb z@}}1}=K+7m&2b{SY~2rSf23{a1wFDMr(64(;+Y=qE?2FzcY-f8SF*YG_k!FS>eDjh z`$S^$PK|bL9h&Sg>{w8rjkvWk&Dm0^ULA{}J2Bc=g>Bfw-nfVH9qn%{6W1`t$EY)f z%kA}gqEenOo?OmOJ>6c~;9zDbI!ED_l_K50Rtz$d_IG|${gb_$dNe&1?C?{k&{vk@ z^m2~w1t`*Ia|Wn%ZvWAjat=M#le~0tTp%e{Mvks@@4+K%ES#QC+DuZVsBDQ99bxFD;K}?=UV>iin*M6p%!S1K z`hvh}S@iFu&bD4;CIs0x$_c1j?LFBLHtbHwYzO8oc*oU{;~B=f7%|hERoxJuDO{dV z7gDc+aB4$(EUV3!UI|LATPJcsJl?1PlZMOM?qz*(Z}icTpqD6H@xmYbk!mtHH?e3j zY2us=J9w<-Y+zk=lewhQA-IWXwq$0?=xpL+&%UZC58RnPalZO|w;X*NPhz_(X=rf- zik6UsU^yB-17RN6r)np-h*F*Z@Vdd)hOpVQ;@j>ZLwk9On6%}=F{F}i!^1;*ZFszD z*{?r&)A9eIp%nNKEXk+Aj}zLooWz6ncoR@5r`Z7Du;U?{3tW&h~g05KhJ zs3C0YMe%KwuFrJs=Hv*qdw3=#e&jiF2QX8B&fV%Ja7Yj}ODJ`sR~@+`Q1b0!fW3eC;6l f-^iHH^OWI}j=x`)3o6H(f&pl^^{%`f?q~lORLnk1 literal 0 HcmV?d00001 diff --git a/competitions/getting-started/titanic/titanic-data-science-solutions.md b/competitions/getting-started/titanic/titanic-data-science-solutions.md new file mode 100644 
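index 00000000..4687c080
--- /dev/null
+++ b/competitions/getting-started/titanic/titanic-data-science-solutions.md
@@ -0,0 +1,3592 @@
+
+# Titanic Data Science Solutions
+
+---
+
+### I have released a new Python package, [Speedml](https://speedml.com), which codifies the techniques used in this notebook into an intuitive, powerful, and productive API.
+
+### Speedml helped me jump from the low 80% to the high 20% on the Kaggle leaderboard within a few iterations.
+
+### One more thing... Speedml achieves this with nearly 70% fewer lines of code!
+
+### Download and run the [Titanic solution using Speedml](https://github.com/Speedml/notebooks/blob/master/titanic/titanic-solution-using-speedml.ipynb).
+
+---
+
+This notebook is a companion to the book [Data Science Solutions](https://startupsci.com). It walks us through a typical workflow for solving data science competitions at sites like Kaggle.
+
+There are several excellent notebooks for studying data science competition entries.
+However, many of them skip some of the explanation of how the solution is developed, because they are written by experts for experts.
+The objective of this notebook is to follow a step-by-step workflow, explaining each step and the rationale for every decision we take during solution development.
+
+## Workflow stages
+
+1. Question or problem definition.
+2. Acquire training and testing data.
+3. Wrangle, prepare, cleanse the data.
+4. Analyze, identify patterns, and explore the data.
+5. Model, predict, and solve the problem.
+6. Visualize, report, and present the problem solving steps and final solution.
+7. Supply or submit the results.
+
+The workflow indicates the general sequence in which each stage may follow the other.
+However, there are use cases with exceptions.
+
+- We may combine multiple workflow stages. We may analyze by visualizing data.
+- Perform a stage earlier than indicated. We may analyze data before and after wrangling.
+- Perform a stage multiple times in our workflow. The visualize stage may be used multiple times.
+- Drop a stage altogether. We may not need the supply stage to productize or service-enable our dataset for a competition.
+
+## Question and problem definition
+
+Competition sites like Kaggle define the problem to solve or the questions to ask, while providing the datasets for training a data science model and for testing the model results against a test dataset (that is, a training set and a test set).
+The question or problem definition for the Titanic survival competition is [described here at Kaggle](https://www.kaggle.com/c/titanic).
+
+> Knowing from a training set of samples (train.csv) listing passengers who survived or did not survive the Titanic disaster, can our model determine, based on a given test dataset (test.csv) that does not contain the survival information, whether these passengers in the test dataset survived or not?
+
+We may also want to develop some early understanding of the domain of our problem.
+This is described in detail on the [Kaggle competition description](https://www.kaggle.com/c/titanic) page.
+Here are the highlights to note.
+
+- On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This translates to a 32% survival rate.
+- One of the reasons the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew.
+- Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class.
+
+## Workflow goals
+
+The data science solutions workflow solves for seven major goals.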
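+
+**Classifying.** We may want to classify or categorize our samples. We may also want to understand the implications or correlation of different classes with our solution goal.
+
+**Correlating.** One can approach the problem based on the features available within the training dataset. Which features within the dataset contribute significantly to our solution goal? Statistically speaking, is there a [correlation](https://en.wikiversity.org/wiki/Correlation) between a feature and the solution goal? As the feature values change, does the solution state change as well, and vice versa? This can be tested for both numerical and categorical features in the given dataset. We may also want to determine correlation among features other than survival for subsequent goals and workflow stages. Correlating certain features may help in creating, completing, or correcting features.
+
+**Converting.** For the modeling stage, one needs to prepare the data. Depending on the choice of model algorithm, one may require all features to be converted to numerical equivalent values, for instance converting text categorical values to numeric values.
+
+**Completing.** Data preparation may also require us to estimate any missing values within a feature. Model algorithms may work best when there are no missing values.
+
+**Correcting.** We may also analyze the given training dataset for errors or possibly inaccurate values within features, and try to correct these values or exclude the samples containing the errors. One way to do this is to detect any outliers among our samples or features. We may also completely discard a feature if it is not contributing to the analysis or may significantly skew the results.
+
+**Creating.** We can create new features based on an existing feature or a set of features, such that the new feature follows the correlation, conversion, and completeness goals.
+
+**Charting.** How to select the right visualization plots and charts depending on the nature of the data and the solution goals.
+
+## Refactor Release 2017-Jan-29
+
+We are significantly refactoring the notebook based on (a) comments received from readers, (b) issues in porting the notebook from the Jupyter kernel (2.7) to the Kaggle kernel (3.5), and (c) a review of a few more best-practice kernels.
+
+### User comments
+
+- Combine training and test data for certain operations like converting titles across the dataset to numerical values. (thanks @Sharan Naribole)
+- Correct observation: nearly 30% of the passengers had siblings and/or spouses aboard. (thanks @Reinhard)
+- Correctly interpreting logistic regression coefficients. (thanks @Reinhard)
+
+### Porting issues
+
+- Specify plot dimensions, bring legend into plot.
+
+### Best practices
+
+- Perform feature correlation analysis early in the project.
+- Use multiple plots instead of overlays for readability.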
+
+```python
+# data analysis and wrangling
+import pandas as pd
+import numpy as np
+import random as rnd
+
+# visualization
+import seaborn as sns
+import matplotlib.pyplot as plt
+%matplotlib inline
+
+# machine learning
+from sklearn.linear_model import LogisticRegression
+from sklearn.svm import SVC, LinearSVC
+from sklearn.ensemble import RandomForestClassifier
+from sklearn.neighbors import KNeighborsClassifier
+from sklearn.naive_bayes import GaussianNB
+from sklearn.linear_model import Perceptron
+from sklearn.linear_model import SGDClassifier
+from sklearn.tree import DecisionTreeClassifier
+```
+
+## Acquire data
+
+The Python Pandas package helps us work with our datasets.
+We start by acquiring the training and testing datasets into Pandas DataFrames.
+We also keep both DataFrames in a list, `combine`, so that certain operations can be run on both datasets together.
+
+```python
+train_df = pd.read_csv('../input/train.csv')
+test_df = pd.read_csv('../input/test.csv')
+combine = [train_df, test_df]
+```
+
+## Analyze by describing data
+
+Pandas also helps describe the datasets, answering the following questions early in our project.
+
+**Which features are available in the dataset?**
+
+Note the feature names, since we manipulate and analyze the features directly by name.
+These feature names are described on the [Kaggle data page](https://www.kaggle.com/c/titanic/data).
+
+```python
+print(train_df.columns.values)
+```
+
+    ['PassengerId' 'Survived' 'Pclass' 'Name' 'Sex' 'Age' 'SibSp' 'Parch'
+     'Ticket' 'Fare' 'Cabin' 'Embarked']
+
+**Which features are categorical?**
+
+These values classify the samples into sets of similar samples.
+Within categorical features, are the values nominal, ordinal, ratio, or interval based?
+Among other things, this helps us select the appropriate plots for visualization.
+
+- Categorical: Survived, Sex, and Embarked. Ordinal: Pclass.
+
+**Which features are numerical?**
+
+These values change from sample to sample.
+Within numerical features, are the values discrete, continuous, or timeseries based?
+
+- Continuous: Age, Fare. Discrete: SibSp, Parch.
+
+```python
+# preview the data
+train_df.head()
+```
+
+|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
+| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
+| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
+| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
+| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
+
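+
+**Which features are mixed data types?**
+
+Numerical and alphanumeric data within the same feature.
+These are candidates for the correcting goal.
+
+- Ticket is a mix of numeric and alphanumeric data types. Cabin is alphanumeric.
+
+**Which features may contain errors or typos?**
+
+This is harder to review for a large dataset. However, reviewing a few samples from a smaller dataset may tell us outright which features may require correcting.
+
+- The Name feature may contain errors or typos, as there are several ways to describe a name, including titles, round brackets, and quotes used for alternative or short names.
+
+```python
+train_df.tail()
+```
+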
+|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| 886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.00 | NaN | S |
+| 887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.00 | B42 | S |
+| 888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.45 | NaN | S |
+| 889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.00 | C148 | C |
+| 890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.75 | NaN | Q |
+
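The mix of numeric and alphanumeric Ticket values is visible in the rows just shown. As a small, dependency-free sketch (the helper name `ticket_kind` is made up for illustration and is not part of the notebook), the two kinds could be told apart like this:

```python
# Ticket values copied from the last five training rows shown above.
tickets = ["211536", "112053", "W./C. 6607", "111369", "370376"]


def ticket_kind(ticket: str) -> str:
    """Label a ticket as purely numeric or alphanumeric."""
    return "numeric" if ticket.isdigit() else "alphanumeric"


# Map each ticket to its kind and print a short report.
kinds = {ticket: ticket_kind(ticket) for ticket in tickets}
for ticket, kind in kinds.items():
    print(f"{ticket!r:14} -> {kind}")
```

On the full dataset the same check would presumably be applied column-wise with pandas; the plain-Python version is only meant to make the distinction concrete.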
+
+**Which features contain blank, null or empty values?**
+
+These will require correcting.
+
+- Cabin > Age > Embarked features contain a number of null values, in that order, for the training dataset.
+- Cabin > Age are incomplete in the test dataset, where Fare is also missing a single value.
+
+**What are the data types for various features?**
+
+This helps us during the converting goal.
+
+- Seven features are integers or floats. Six in the case of the test dataset.
+- Five features are strings (object).
+
+```python
+train_df.info()
+print('_'*40)
+test_df.info()
+```
+
+    <class 'pandas.core.frame.DataFrame'>
+    RangeIndex: 891 entries, 0 to 890
+    Data columns (total 12 columns):
+    PassengerId    891 non-null int64
+    Survived       891 non-null int64
+    Pclass         891 non-null int64
+    Name           891 non-null object
+    Sex            891 non-null object
+    Age            714 non-null float64
+    SibSp          891 non-null int64
+    Parch          891 non-null int64
+    Ticket         891 non-null object
+    Fare           891 non-null float64
+    Cabin          204 non-null object
+    Embarked       889 non-null object
+    dtypes: float64(2), int64(5), object(5)
+    memory usage: 83.6+ KB
+    ________________________________________
+    <class 'pandas.core.frame.DataFrame'>
+    RangeIndex: 418 entries, 0 to 417
+    Data columns (total 11 columns):
+    PassengerId    418 non-null int64
+    Pclass         418 non-null int64
+    Name           418 non-null object
+    Sex            418 non-null object
+    Age            332 non-null float64
+    SibSp          418 non-null int64
+    Parch          418 non-null int64
+    Ticket         418 non-null object
+    Fare           417 non-null float64
+    Cabin          91 non-null object
+    Embarked       418 non-null object
+    dtypes: float64(2), int64(4), object(5)
+    memory usage: 36.0+ KB
+
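The non-null counts printed by `info()` are enough to quantify how incomplete each feature is. The following is a rough sketch that uses only the training-set counts from the output above; the variable names are illustrative and not taken from the notebook.

```python
# Non-null counts copied from the train_df.info() output above.
total_rows = 891
non_null = {"Age": 714, "Cabin": 204, "Embarked": 889}

# Number and share of missing values per feature.
missing = {name: total_rows - count for name, count in non_null.items()}
missing_share = {name: n / total_rows for name, n in missing.items()}

# Rank features from most to least incomplete.
ranked = sorted(missing, key=missing.get, reverse=True)
for name in ranked:
    print(f"{name:9} missing={missing[name]:3d} ({missing_share[name]:.1%})")
```

The ranking matches the Cabin > Age > Embarked ordering stated in the text: Cabin is missing for roughly three quarters of the training rows, while Embarked lacks only two values.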
+# Review Parch distribution using `percentiles=[.75, .8]` +# SibSp distribution `[.68, .69]` +# Age and Fare `[.1, .2, .3, .4, .5, .6, .7, .8, .9, .99]` +``` + + + + +
+|   | PassengerId | Survived | Pclass | Age | SibSp | Parch | Fare |
+|---|---|---|---|---|---|---|---|
+| count | 891.000000 | 891.000000 | 891.000000 | 714.000000 | 891.000000 | 891.000000 | 891.000000 |
+| mean | 446.000000 | 0.383838 | 2.308642 | 29.699118 | 0.523008 | 0.381594 | 32.204208 |
+| std | 257.353842 | 0.486592 | 0.836071 | 14.526497 | 1.102743 | 0.806057 | 49.693429 |
+| min | 1.000000 | 0.000000 | 1.000000 | 0.420000 | 0.000000 | 0.000000 | 0.000000 |
+| 25% | 223.500000 | 0.000000 | 2.000000 | 20.125000 | 0.000000 | 0.000000 | 7.910400 |
+| 50% | 446.000000 | 0.000000 | 3.000000 | 28.000000 | 0.000000 | 0.000000 | 14.454200 |
+| 75% | 668.500000 | 1.000000 | 3.000000 | 38.000000 | 1.000000 | 0.000000 | 31.000000 |
+| max | 891.000000 | 1.000000 | 3.000000 | 80.000000 | 8.000000 | 6.000000 | 512.329200 |
+
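The percentages quoted for sample coverage and survival can be reproduced with simple arithmetic from the figures already given (2,224 people on board, 1,502 deaths, 891 training rows, and a Survived mean of 0.383838). This is only a back-of-the-envelope sketch; the variable names are illustrative.

```python
# Figures taken from the problem description and the describe() output above.
people_on_board = 2224        # passengers and crew
deaths = 1502
train_rows = 891
train_survived_mean = 0.383838  # mean of the 0/1 Survived column

# Share of the people on board that the training set covers.
coverage = train_rows / people_on_board

# Survival rate of the actual disaster versus the training sample.
actual_survival = 1 - deaths / people_on_board
train_survivors = round(train_survived_mean * train_rows)

print(f"training coverage: {coverage:.1%}")
print(f"actual survival:   {actual_survival:.1%}")
print(f"train survival:    {train_survived_mean:.1%} ({train_survivors} of {train_rows})")
```

Because Survived is coded as 0 or 1, its mean is exactly the fraction of survivors, which is why the `describe()` table can be read as a survival rate.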
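+
+**What is the distribution of categorical features?**
+
+- Names are unique across the dataset (count=unique=891).
+- The Sex variable has two possible values, with 65% male (top=male, freq=577/count=891).
+- Cabin values have several duplicates across samples. Alternatively, several passengers shared a cabin.
+- Embarked takes three possible values. The S port was used by most passengers (top=S).
+- The Ticket feature has a high ratio of duplicate values (about 24%; unique=681 of 891).
+
+```python
+train_df.describe(include=['O'])
+```
+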
+|   | Name | Sex | Ticket | Cabin | Embarked |
+|---|---|---|---|---|---|
+| count | 891 | 891 | 891 | 204 | 889 |
+| unique | 891 | 2 | 681 | 147 | 3 |
+| top | Mitchell, Mr. Henry Michael | male | 1601 | C23 C25 C27 | S |
+| freq | 1 | 577 | 7 | 4 | 644 |
+
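The duplicate ratios and the male share mentioned for the categorical features follow directly from the `count`, `unique`, and `freq` rows above. A minimal sketch, with names chosen only for illustration:

```python
# (count, unique) pairs copied from the describe(include=['O']) output above.
summary = {
    "Name": (891, 891),
    "Sex": (891, 2),
    "Ticket": (891, 681),
    "Cabin": (204, 147),
    "Embarked": (889, 3),
}

# Share of non-null values that repeat an earlier value: 1 - unique / count.
duplicate_share = {name: 1 - unique / count for name, (count, unique) in summary.items()}

# Share of male passengers, from top=male with freq=577 out of 891.
male_share = 577 / 891

print(f"Ticket duplicates: {duplicate_share['Ticket']:.1%}")
print(f"Cabin duplicates:  {duplicate_share['Cabin']:.1%}")
print(f"Male share:        {male_share:.1%}")
```

Sex and Embarked trivially have high duplicate shares because they have only a few categories; the ratio is informative mainly for identifier-like features such as Name, Ticket, and Cabin.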
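+
+### Assumptions based on data analysis
+
+We arrive at the following assumptions based on the data analysis done so far.
+We may validate these assumptions further before taking appropriate actions.
+
+**Correlating.**
+
+We want to know how well each feature correlates with survival.
+We want to do this early in our project and match these quick correlations with modelled correlations later in the project.
+
+**Completing.**
+
+1. We may want to complete the Age feature, as it is definitely correlated with survival.
+2. We may want to complete the Embarked feature, as it may also correlate with survival or another important feature.
+
+**Correcting.**
+
+1. The Ticket feature may be dropped from our analysis, as it contains a high ratio of duplicates (about 24%) and there may not be a correlation between Ticket and survival.
+2. The Cabin feature may be dropped, as it is highly incomplete and contains many null values in both the training and test datasets.
+3. PassengerId may be dropped from the training dataset, as it does not contribute to survival.
+4. The Name feature is relatively non-standard and may not contribute directly to survival, so it may be dropped.
+
+**Creating.**
+
+1. We may want to create a new feature called Family, based on Parch and SibSp, to get the total count of family members on board.
+2. We may want to engineer the Name feature to extract Title as a new feature.
+3. We may want to create a new feature for Age bands. This turns a continuous numerical feature into an ordinal categorical feature.
+4. We may also want to create a Fare range feature if it helps our analysis.
+
+**Classifying.**
+
+We may also add to our assumptions based on the problem description noted earlier.
+
+1. Women (Sex=female) were more likely to have survived.
+2. Children (Age < ?) were more likely to have survived.
+3. Upper-class passengers (Pclass=1) were more likely to have survived.
+
+## Analyze by pivoting features
+
+To confirm some of our observations and assumptions, we can quickly analyze feature correlations by pivoting features against each other.
+
+- **Pclass** We observe significant correlation (>0.5) between Pclass=1 and Survived (classifying #3). We decide to include this feature in our model.
+- **Sex** We confirm the observation from the problem definition that Sex=female had a very high survival rate of 74% (classifying #1).
+- **SibSp and Parch** These features have zero correlation for certain values. It may be best to derive a feature or a set of features from these individual features (creating #1).
+
+```python
+train_df[['Pclass', 'Survived']].groupby(['Pclass'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+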
+|   | Pclass | Survived |
+|---|---|---|
+| 0 | 1 | 0.629630 |
+| 1 | 2 | 0.472826 |
+| 2 | 3 | 0.242363 |
+
+
+```python
+train_df[["Sex", "Survived"]].groupby(['Sex'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+
+|   | Sex | Survived |
+|---|---|---|
+| 0 | female | 0.742038 |
+| 1 | male | 0.188908 |
+
+
+```python
+train_df[["SibSp", "Survived"]].groupby(['SibSp'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+
+|   | SibSp | Survived |
+|---|---|---|
+| 1 | 1 | 0.535885 |
+| 2 | 2 | 0.464286 |
+| 0 | 0 | 0.345395 |
+| 3 | 3 | 0.250000 |
+| 4 | 4 | 0.166667 |
+| 5 | 5 | 0.000000 |
+| 6 | 8 | 0.000000 |
+
+
+```python
+train_df[["Parch", "Survived"]].groupby(['Parch'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+
+|   | Parch | Survived |
+|---|---|---|
+| 3 | 3 | 0.600000 |
+| 1 | 1 | 0.550847 |
+| 2 | 2 | 0.500000 |
+| 0 | 0 | 0.343658 |
+| 5 | 5 | 0.200000 |
+| 4 | 4 | 0.000000 |
+| 6 | 6 | 0.000000 |
+
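Each pivot above groups the rows by one feature and averages the 0/1 Survived column within each group. As a rough, dependency-free illustration of what that `groupby(...).mean().sort_values(...)` chain computes, here is the same idea applied to the (Sex, Survived) values of the first five training rows shown earlier. The function name `survival_rate_by` is hypothetical.

```python
from collections import defaultdict

# (Sex, Survived) pairs from the first five training rows shown earlier.
rows = [("male", 0), ("female", 1), ("female", 1), ("female", 1), ("male", 0)]


def survival_rate_by(pairs):
    """Return {group: mean survival}, ordered from highest to lowest rate."""
    totals = defaultdict(lambda: [0, 0])  # group -> [survivors, sample count]
    for group, survived in pairs:
        totals[group][0] += survived
        totals[group][1] += 1
    rates = {group: s / n for group, (s, n) in totals.items()}
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


rates = survival_rate_by(rows)
for group, rate in rates.items():
    print(f"{group:7} {rate:.2f}")
```

Five rows are far too few to draw conclusions from; the full-dataset pivots above are the ones that support the assumptions.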
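+
+## Analyze by visualizing data
+
+Now we can continue confirming some of our assumptions using visualizations for analyzing the data.
+
+### Correlating numerical features
+
+Let us start by understanding correlations between numerical features and our solution goal (Survived).
+
+A histogram is useful for analyzing continuous numerical variables like Age, where banding or ranges help identify useful patterns.
+The histogram can indicate the distribution of samples using automatically defined bins or equally ranged bands.
+This helps us answer questions relating to specific bands (did infants have a better survival rate?).
+
+Note that in these histograms the x-axis shows Age and the y-axis shows the count of samples (passengers).
+
+**Observations.**
+
+- Infants (Age <= 4) had a high survival rate.
+- The oldest passengers (Age = 80) survived.
+- A large number of 15-25 year olds did not survive.
+- Most passengers are in the 15-35 age range.
+
+**Decisions.**
+
+This simple analysis confirms our assumptions as decisions for subsequent workflow stages.
+
+- We should consider Age (our assumption classifying #2) in our model training.
+- Complete the Age feature for null values (completing #1).
+- We should band age groups (creating #3).
+
+```python
+g = sns.FacetGrid(train_df, col='Survived')
+g.map(plt.hist, 'Age', bins=20)
+```
+
+![png](image/titanic_output_24_1.png)
+
+### Correlating numerical and ordinal features
+
+We can combine multiple features for identifying correlations using a single plot.
+This can be done with numerical and categorical features which have numeric values.
+
+**Observations.**
+
+- Pclass=3 had the most passengers, however most did not survive. Confirms our classifying assumption #2.
+- Infant passengers in Pclass=2 and Pclass=3 mostly survived. Further qualifies our classifying assumption #2.
+- Most passengers in Pclass=1 survived. Confirms our classifying assumption #3.
+- Pclass varies in terms of the Age distribution of passengers.
+
+**Decisions.**
+
+- Consider Pclass for model training.
+
+```python
+# grid = sns.FacetGrid(train_df, col='Pclass', hue='Survived')
+# note: seaborn 0.9+ renamed FacetGrid's `size` argument to `height`
+grid = sns.FacetGrid(train_df, col='Survived', row='Pclass', size=2.2, aspect=1.6)
+grid.map(plt.hist, 'Age', alpha=.5, bins=20)
+grid.add_legend();
+```
+
+![png](image/titanic_output_26_0.png)
+
+### Correlating categorical features
+
+Now we can correlate categorical features with our solution goal.
+
+**Observations.**
+
+- Female passengers had a much better survival rate than males. Confirms classifying (#1).
+- The exception is Embarked=C, where males had a higher survival rate. This could be a correlation between Pclass and Embarked and, in turn, between Pclass and Survived, not necessarily a direct correlation between Embarked and Survived.
+- Males had a better survival rate in Pclass=3 when compared with Pclass=2 for the C and Q ports. Completing (#2).
+- Ports of embarkation have varying survival rates for Pclass=3 and among male passengers. Correlating (#1).
+
+**Decisions.**
+
+- Add the Sex feature to model training.
+- Complete and add the Embarked feature to model training.
+
+```python
+# grid = sns.FacetGrid(train_df, col='Embarked')
+# note: seaborn 0.9+ renamed FacetGrid's `size` argument to `height`
+grid = sns.FacetGrid(train_df, row='Embarked', size=2.2, aspect=1.6)
+grid.map(sns.pointplot, 'Pclass', 'Survived', 'Sex', palette='deep')
+grid.add_legend()
+```
+
+![png](image/titanic_output_28_1.png)
+
+### Correlating categorical and numerical features
+
+We may also want to correlate categorical features (with non-numeric values) and numeric features.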
+We can consider correlating Embarked (categorical, non-numeric), Sex (categorical, non-numeric), and Fare (numeric, continuous) with Survived (categorical, numeric).
+
+**Observations.**
+
+- Passengers who paid higher fares had better survival. Confirms our assumption for creating (#4) fare ranges.
+- Port of embarkation correlates with survival rates. Confirms correlating (#1) and completing (#2).
+
+**Decisions.**
+
+- Consider banding the Fare feature.
+
+```python
+# grid = sns.FacetGrid(train_df, col='Embarked', hue='Survived', palette={0: 'k', 1: 'w'})
+# note: seaborn 0.9+ renamed FacetGrid's `size` argument to `height`
+grid = sns.FacetGrid(train_df, row='Embarked', col='Survived', size=2.2, aspect=1.6)
+grid.map(sns.barplot, 'Sex', 'Fare', alpha=.5, ci=None)
+grid.add_legend()
+```
+
+![png](image/titanic_output_30_1.png)
+
+## Wrangle data
+
+We have collected several assumptions and decisions regarding our datasets and solution requirements.
+So far we did not have to change a single feature or value to arrive at these.
+Let us now execute our decisions and assumptions for the correcting, creating, and completing goals.
+
+### Correcting by dropping features
+
+This is a good starting goal to execute. By dropping features we are dealing with fewer data points. This speeds up our notebook and eases the analysis.
+
+Based on our assumptions and decisions we want to drop the Cabin (correcting #2) and Ticket (correcting #1) features.
+
+Note that, where applicable, we perform operations on both the training and testing datasets together to stay consistent.
+
+```python
+print("Before", train_df.shape, test_df.shape, combine[0].shape, combine[1].shape)
+
+train_df = train_df.drop(['Ticket', 'Cabin'], axis=1)
+test_df = test_df.drop(['Ticket', 'Cabin'], axis=1)
+combine = [train_df, test_df]
+
+"After", train_df.shape, test_df.shape, combine[0].shape, combine[1].shape
+```
+
+    Before (891, 12) (418, 11) (891, 12) (418, 11)
+
+    ('After', (891, 10), (418, 9), (891, 10), (418, 9))
+
+### Creating new feature extracting from existing
+
+We want to analyze whether the Name feature can be engineered to extract titles, and test the correlation between titles and survival, before dropping the Name and PassengerId features.
+
+In the following code we extract the Title feature using a regular expression. The pattern ` ([A-Za-z]+)\.` matches the first run of letters in the Name feature that is preceded by a space and followed by a dot character.
+With `expand=False` and a single capture group, `str.extract` returns a Series.
+
+**Observations.**
+
+When we plot Title, Age, and Survived, we note the following observations.
+
+- Most titles band Age groups accurately. For example, the Master title has a mean Age of 5 years.
+- Survival among Title Age bands varies slightly.
+- Certain titles mostly survived (Mme, Lady, Sir) or did not (Don, Rev, Jonkheer).
+
+**Decision.**
+
+- We decide to retain the new Title feature for model training.
+
+
+```python
+for dataset in combine:
+    # Raw string so that `\.` reaches the regex engine unchanged.
+    dataset['Title'] = dataset.Name.str.extract(r' ([A-Za-z]+)\.', expand=False)
+
+pd.crosstab(train_df['Title'], train_df['Sex'])
+```
+
+
+| Title | female | male |
+|---|---|---|
+| Capt | 0 | 1 |
+| Col | 0 | 2 |
+| Countess | 1 | 0 |
+| Don | 0 | 1 |
+| Dr | 1 | 6 |
+| Jonkheer | 0 | 1 |
+| Lady | 1 | 0 |
+| Major | 0 | 2 |
+| Master | 0 | 40 |
+| Miss | 182 | 0 |
+| Mlle | 2 | 0 |
+| Mme | 1 | 0 |
+| Mr | 0 | 517 |
+| Mrs | 125 | 0 |
+| Ms | 1 | 0 |
+| Rev | 0 | 6 |
+| Sir | 0 | 1 |
+
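The title extraction above can be sketched in plain Python, without pandas. This is only an illustration: it assumes the same pattern ` ([A-Za-z]+)\.` used in the notebook code, `extract_title` is a hypothetical helper name, and the sample names are rows shown elsewhere in this notebook.

```python
import re

# Same pattern as the notebook: a space, a run of letters, then a literal dot.
TITLE_PATTERN = re.compile(r' ([A-Za-z]+)\.')


def extract_title(name):
    """Return the first title-like token in a passenger name, or None."""
    match = TITLE_PATTERN.search(name)
    return match.group(1) if match else None


names = [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
]
titles = [extract_title(n) for n in names]
print(titles)
```

Because the pattern requires a leading space, the surname before the comma is never picked up, even when it is followed by punctuation.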
+
+We can replace many titles with a more common name or classify them as `Rare`.
+
+
+```python
+for dataset in combine:
+    dataset['Title'] = dataset['Title'].replace(['Lady', 'Countess', 'Capt', 'Col',
+        'Don', 'Dr', 'Major', 'Rev', 'Sir', 'Jonkheer', 'Dona'], 'Rare')
+
+    dataset['Title'] = dataset['Title'].replace('Mlle', 'Miss')
+    dataset['Title'] = dataset['Title'].replace('Ms', 'Miss')
+    dataset['Title'] = dataset['Title'].replace('Mme', 'Mrs')
+
+train_df[['Title', 'Survived']].groupby(['Title'], as_index=False).mean()
+```
+
+
+|   | Title | Survived |
+|---|---|---|
+| 0 | Master | 0.575000 |
+| 1 | Miss | 0.702703 |
+| 2 | Mr | 0.156673 |
+| 3 | Mrs | 0.793651 |
+| 4 | Rare | 0.347826 |
+
+
+We can convert the categorical titles to ordinal values.
+
+
+```python
+title_mapping = {"Mr": 1, "Miss": 2, "Mrs": 3, "Master": 4, "Rare": 5}
+for dataset in combine:
+    dataset['Title'] = dataset['Title'].map(title_mapping)
+    dataset['Title'] = dataset['Title'].fillna(0)
+
+train_df.head()
+```
+
+
+|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked | Title |
+|---|---|---|---|---|---|---|---|---|---|---|---|
+| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | 7.2500 | S | 1 |
+| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | 71.2833 | C | 3 |
+| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | 7.9250 | S | 2 |
+| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 53.1000 | S | 3 |
+| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 8.0500 | S | 1 |
+
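+
+Now we can safely drop the Name feature from the training and testing datasets.
+We also do not need the PassengerId feature in the training dataset.
+
+
+```python
+train_df = train_df.drop(['Name', 'PassengerId'], axis=1)
+test_df = test_df.drop(['Name'], axis=1)
+combine = [train_df, test_df]
+train_df.shape, test_df.shape
+```
+
+
+
+
+    ((891, 9), (418, 9))
+
+
+
+### Converting a categorical feature
+
+Now we can convert features which contain strings to numerical values.
+This is required by most model algorithms.
+Doing so will also help us achieve the feature completing goal.
+Let us start by converting the Sex feature to numeric values, where female=1 and male=0. The code overwrites Sex in place rather than adding a separate Gender column.
+
+
+```python
+for dataset in combine:
+    dataset['Sex'] = dataset['Sex'].map( {'female': 1, 'male': 0} ).astype(int)
+
+train_df.head()
+```
+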
+
+|   | Survived | Pclass | Sex | Age | SibSp | Parch | Fare | Embarked | Title |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 22.0 | 1 | 0 | 7.2500 | S | 1 |
+| 1 | 1 | 1 | 1 | 38.0 | 1 | 0 | 71.2833 | C | 3 |
+| 2 | 1 | 3 | 1 | 26.0 | 0 | 0 | 7.9250 | S | 2 |
+| 3 | 1 | 1 | 1 | 35.0 | 1 | 0 | 53.1000 | S | 3 |
+| 4 | 0 | 3 | 0 | 35.0 | 0 | 0 | 8.0500 | S | 1 |
+
+
+### Completing a numerical continuous feature
+
+Now we should start estimating and completing features with missing or null values.
+We will first do this for the Age feature.
+
+We can consider three methods to complete a numerical continuous feature.
+
+1. A simple way is to generate random numbers within one [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of the mean.
+
+2. A more accurate way of guessing missing values is to use other correlated features. In our case we note a correlation among Age, Sex, and Pclass. Guess Age values using the [median](https://en.wikipedia.org/wiki/Median) Age across sets of Pclass and Sex feature combinations: the median Age for Pclass=1 and Sex=0, for Pclass=1 and Sex=1, and so on.
+
+3. Combine methods 1 and 2. Instead of guessing age values from the median, use random numbers within one standard deviation of the mean, computed per Pclass and Sex combination.
+
+Methods 1 and 3 introduce random noise into our models, so the results of multiple executions might vary. We prefer method 2.
+
+
+```python
+# grid = sns.FacetGrid(train_df, col='Pclass', hue='Sex')
+# seaborn 0.9 renamed the `size` argument to `height`; use `size=2.2` on older versions.
+grid = sns.FacetGrid(train_df, row='Pclass', col='Sex', height=2.2, aspect=1.6)
+grid.map(plt.hist, 'Age', alpha=.5, bins=20)
+grid.add_legend()
+```
+
+![png](image/titanic_output_44_1.png)
+
+
+Let us start by preparing an empty array to hold the guessed Age values for each Sex x Pclass combination.
+
+
+```python
+guess_ages = np.zeros((2,3))
+guess_ages
+```
+
+
+
+
+    array([[ 0.,  0.,  0.],
+           [ 0.,  0.,  0.]])
+
+
+
+Now we iterate over Sex (0 or 1) and Pclass (1, 2, 3) to calculate guessed values of Age for the six combinations.
+
+
+```python
+for dataset in combine:
+    for i in range(0, 2):
+        for j in range(0, 3):
+            guess_df = dataset[(dataset['Sex'] == i) &
+                               (dataset['Pclass'] == j+1)]['Age'].dropna()
+
+            # age_mean = guess_df.mean()
+            # age_std = guess_df.std()
+            # age_guess = rnd.uniform(age_mean - age_std, age_mean + age_std)
+
+            age_guess = guess_df.median()
+
+            # Round the median age guess to the nearest 0.5
+            guess_ages[i,j] = int( age_guess/0.5 + 0.5 ) * 0.5
+
+    for i in range(0, 2):
+        for j in range(0, 3):
+            dataset.loc[ (dataset.Age.isnull()) & (dataset.Sex == i) & (dataset.Pclass == j+1),
+                    'Age'] = guess_ages[i,j]
+
+    dataset['Age'] = dataset['Age'].astype(int)
+
+train_df.head()
+```
+
+
+|   | Survived | Pclass | Sex | Age | SibSp | Parch | Fare | Embarked | Title |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 22 | 1 | 0 | 7.2500 | S | 1 |
+| 1 | 1 | 1 | 1 | 38 | 1 | 0 | 71.2833 | C | 3 |
+| 2 | 1 | 3 | 1 | 26 | 0 | 0 | 7.9250 | S | 2 |
+| 3 | 1 | 1 | 1 | 35 | 1 | 0 | 53.1000 | S | 3 |
+| 4 | 0 | 3 | 0 | 35 | 0 | 0 | 8.0500 | S | 1 |
+
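The group-median imputation described above (method 2) can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's implementation: `round_to_half` and `fill_age_by_group` are hypothetical helper names, rows are plain dictionaries, and the toy values are invented rather than taken from the Titanic data.

```python
from statistics import median


def round_to_half(value):
    """Round to the nearest 0.5, mirroring int(x / 0.5 + 0.5) * 0.5 in the notebook."""
    return int(value / 0.5 + 0.5) * 0.5


def fill_age_by_group(rows):
    """Fill missing 'Age' with the median Age of rows sharing the same (Sex, Pclass).

    Assumes every group with a missing Age has at least one known Age.
    """
    known = {}
    for row in rows:
        if row["Age"] is not None:
            known.setdefault((row["Sex"], row["Pclass"]), []).append(row["Age"])
    guesses = {key: round_to_half(median(ages)) for key, ages in known.items()}
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row["Age"] is None:
            new_row["Age"] = guesses[(new_row["Sex"], new_row["Pclass"])]
        filled.append(new_row)
    return filled


# Hypothetical toy rows, invented for illustration.
rows = [
    {"Sex": 0, "Pclass": 3, "Age": 20.0},
    {"Sex": 0, "Pclass": 3, "Age": 25.0},
    {"Sex": 0, "Pclass": 3, "Age": None},
    {"Sex": 1, "Pclass": 1, "Age": 38.0},
    {"Sex": 1, "Pclass": 1, "Age": None},
]
filled = fill_age_by_group(rows)
print([row["Age"] for row in filled])
```

Using the median per group keeps the result deterministic, which is the reason the text prefers method 2 over the random variants.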
+
+Let us create Age bands and determine their correlation with Survived.
+
+
+```python
+train_df['AgeBand'] = pd.cut(train_df['Age'], 5)
+train_df[['AgeBand', 'Survived']].groupby(['AgeBand'], as_index=False).mean().sort_values(by='AgeBand', ascending=True)
+```
+
+
+|   | AgeBand | Survived |
+|---|---|---|
+| 0 | (-0.08, 16.0] | 0.550000 |
+| 1 | (16.0, 32.0] | 0.337374 |
+| 2 | (32.0, 48.0] | 0.412037 |
+| 3 | (48.0, 64.0] | 0.434783 |
+| 4 | (64.0, 80.0] | 0.090909 |
+
+
+Let us replace Age with ordinal values based on these bands.
+
+
+```python
+for dataset in combine:
+    dataset.loc[ dataset['Age'] <= 16, 'Age'] = 0
+    dataset.loc[(dataset['Age'] > 16) & (dataset['Age'] <= 32), 'Age'] = 1
+    dataset.loc[(dataset['Age'] > 32) & (dataset['Age'] <= 48), 'Age'] = 2
+    dataset.loc[(dataset['Age'] > 48) & (dataset['Age'] <= 64), 'Age'] = 3
+    # The assignment was missing here, which left ages above 64 unbanded.
+    dataset.loc[ dataset['Age'] > 64, 'Age'] = 4
+train_df.head()
+```
+
+
+|   | Survived | Pclass | Sex | Age | SibSp | Parch | Fare | Embarked | Title | AgeBand |
+|---|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 1 | 1 | 0 | 7.2500 | S | 1 | (16.0, 32.0] |
+| 1 | 1 | 1 | 1 | 2 | 1 | 0 | 71.2833 | C | 3 | (32.0, 48.0] |
+| 2 | 1 | 3 | 1 | 1 | 0 | 0 | 7.9250 | S | 2 | (16.0, 32.0] |
+| 3 | 1 | 1 | 1 | 2 | 1 | 0 | 53.1000 | S | 3 | (32.0, 48.0] |
+| 4 | 0 | 3 | 0 | 2 | 0 | 0 | 8.0500 | S | 1 | (32.0, 48.0] |
+
+
+We can now remove the AgeBand feature.
+
+
+```python
+train_df = train_df.drop(['AgeBand'], axis=1)
+combine = [train_df, test_df]
+train_df.head()
+```
+
+
+|   | Survived | Pclass | Sex | Age | SibSp | Parch | Fare | Embarked | Title |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 1 | 1 | 0 | 7.2500 | S | 1 |
+| 1 | 1 | 1 | 1 | 2 | 1 | 0 | 71.2833 | C | 3 |
+| 2 | 1 | 3 | 1 | 1 | 0 | 0 | 7.9250 | S | 2 |
+| 3 | 1 | 1 | 1 | 2 | 1 | 0 | 53.1000 | S | 3 |
+| 4 | 0 | 3 | 0 | 2 | 0 | 0 | 8.0500 | S | 1 |
+
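The banding of Age into ordinals can also be written as a single lookup. The sketch below is a plain-Python illustration, assuming the band edges 16, 32, 48 and 64 from the AgeBand table above; `AGE_EDGES` and `age_to_band` are hypothetical names.

```python
from bisect import bisect_left

# Upper edges of the first four bands, taken from the AgeBand table above.
AGE_EDGES = [16, 32, 48, 64]


def age_to_band(age):
    """Map an age to an ordinal 0..4; each band is closed on the right, e.g. (16, 32]."""
    return bisect_left(AGE_EDGES, age)


bands = [age_to_band(a) for a in (4, 16, 17, 35, 64, 80)]
print(bands)
```

A single lookup table makes it harder to leave one band without an assigned value, since every age falls into exactly one slot.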
+
+### Create new feature combining existing features
+
+We can create a new FamilySize feature which combines Parch and SibSp.
+This will enable us to drop Parch and SibSp from our datasets.
+
+
+```python
+for dataset in combine:
+    dataset['FamilySize'] = dataset['SibSp'] + dataset['Parch'] + 1
+
+train_df[['FamilySize', 'Survived']].groupby(['FamilySize'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+
+
+|   | FamilySize | Survived |
+|---|---|---|
+| 3 | 4 | 0.724138 |
+| 2 | 3 | 0.578431 |
+| 1 | 2 | 0.552795 |
+| 6 | 7 | 0.333333 |
+| 0 | 1 | 0.303538 |
+| 4 | 5 | 0.200000 |
+| 5 | 6 | 0.136364 |
+| 7 | 8 | 0.000000 |
+| 8 | 11 | 0.000000 |
+
+
+We can create another feature called IsAlone.
+
+
+```python
+for dataset in combine:
+    dataset['IsAlone'] = 0
+    dataset.loc[dataset['FamilySize'] == 1, 'IsAlone'] = 1
+
+train_df[['IsAlone', 'Survived']].groupby(['IsAlone'], as_index=False).mean()
+```
+
+
+|   | IsAlone | Survived |
+|---|---|---|
+| 0 | 0 | 0.505650 |
+| 1 | 1 | 0.303538 |
+
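For a single passenger, the two derived features reduce to a few lines. This is a minimal sketch in plain Python; `family_features` is a hypothetical helper, not part of the notebook.

```python
def family_features(sibsp, parch):
    """Return (FamilySize, IsAlone) for one passenger."""
    family_size = sibsp + parch + 1  # the +1 counts the passenger
    is_alone = 1 if family_size == 1 else 0
    return family_size, is_alone


print(family_features(1, 0))
print(family_features(0, 0))
```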
+
+Let us drop the Parch, SibSp, and FamilySize features in favor of IsAlone.
+
+
+```python
+train_df = train_df.drop(['Parch', 'SibSp', 'FamilySize'], axis=1)
+test_df = test_df.drop(['Parch', 'SibSp', 'FamilySize'], axis=1)
+combine = [train_df, test_df]
+
+train_df.head()
+```
+
+
+|   | Survived | Pclass | Sex | Age | Fare | Embarked | Title | IsAlone |
+|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 1 | 7.2500 | S | 1 | 0 |
+| 1 | 1 | 1 | 1 | 2 | 71.2833 | C | 3 | 0 |
+| 2 | 1 | 3 | 1 | 1 | 7.9250 | S | 2 | 1 |
+| 3 | 1 | 1 | 1 | 2 | 53.1000 | S | 3 | 0 |
+| 4 | 0 | 3 | 0 | 2 | 8.0500 | S | 1 | 1 |
+
+
+We can also create an artificial feature combining Pclass and Age.
+
+
+```python
+for dataset in combine:
+    dataset['Age*Class'] = dataset.Age * dataset.Pclass
+
+train_df.loc[:, ['Age*Class', 'Age', 'Pclass']].head(10)
+```
+
+
+|   | Age*Class | Age | Pclass |
+|---|---|---|---|
+| 0 | 3 | 1 | 3 |
+| 1 | 2 | 2 | 1 |
+| 2 | 3 | 1 | 3 |
+| 3 | 2 | 2 | 1 |
+| 4 | 6 | 2 | 3 |
+| 5 | 3 | 1 | 3 |
+| 6 | 3 | 3 | 1 |
+| 7 | 0 | 0 | 3 |
+| 8 | 3 | 1 | 3 |
+| 9 | 0 | 0 | 2 |
+
+
+### Completing a categorical feature
+
+The Embarked feature takes the values S, Q, and C based on the port of embarkation.
+Our training dataset has two missing values.
+We simply fill these with the most common occurrence.
+
+
+```python
+freq_port = train_df.Embarked.dropna().mode()[0]
+freq_port
+```
+
+
+
+
+    'S'
+
+
+
+
+```python
+for dataset in combine:
+    dataset['Embarked'] = dataset['Embarked'].fillna(freq_port)
+
+train_df[['Embarked', 'Survived']].groupby(['Embarked'], as_index=False).mean().sort_values(by='Survived', ascending=False)
+```
+
+
+|   | Embarked | Survived |
+|---|---|---|
+| 0 | C | 0.553571 |
+| 1 | Q | 0.389610 |
+| 2 | S | 0.339009 |
+
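Filling a categorical feature with its most frequent value can be sketched without pandas as follows. The helper name `fill_with_mode` and the sample list are assumptions made for illustration only.

```python
from collections import Counter


def fill_with_mode(values):
    """Replace None entries with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    most_common = Counter(present).most_common(1)[0][0]
    return [most_common if v is None else v for v in values]


ports = ["S", "C", None, "S", "Q", None]  # hypothetical sample, not real data
filled = fill_with_mode(ports)
print(filled)
```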
+
+### Converting categorical feature to numeric
+
+We can now convert the Embarked feature to numeric port codes. The code overwrites Embarked in place.
+
+
+```python
+for dataset in combine:
+    dataset['Embarked'] = dataset['Embarked'].map( {'S': 0, 'C': 1, 'Q': 2} ).astype(int)
+
+train_df.head()
+```
+
+
+|   | Survived | Pclass | Sex | Age | Fare | Embarked | Title | IsAlone | Age*Class |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 1 | 7.2500 | 0 | 1 | 0 | 3 |
+| 1 | 1 | 1 | 1 | 2 | 71.2833 | 1 | 3 | 0 | 2 |
+| 2 | 1 | 3 | 1 | 1 | 7.9250 | 0 | 2 | 1 | 3 |
+| 3 | 1 | 1 | 1 | 2 | 53.1000 | 0 | 3 | 0 | 2 |
+| 4 | 0 | 3 | 0 | 2 | 8.0500 | 0 | 1 | 1 | 6 |
+
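+
+### Quick completing and converting a numeric feature
+
+We can now complete the Fare feature for the single missing value in the test dataset, using the median of this feature. We do this in a single line of code.
+
+Note that we are not creating an intermediate new feature or doing any further correlation analysis to guess the missing value, as we are replacing only a single value. The completion goal meets the requirement that the model algorithms operate on non-null values.
+
+We may also want to round the fare to two decimals, as it represents currency.
+
+
+```python
+# Assign the result back instead of calling fillna(..., inplace=True) on a column selection.
+test_df['Fare'] = test_df['Fare'].fillna(test_df['Fare'].dropna().median())
+test_df.head()
+```
+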
+
+|   | PassengerId | Pclass | Sex | Age | Fare | Embarked | Title | IsAlone | Age*Class |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 892 | 3 | 0 | 2 | 7.8292 | 2 | 1 | 1 | 6 |
+| 1 | 893 | 3 | 1 | 2 | 7.0000 | 0 | 3 | 0 | 6 |
+| 2 | 894 | 2 | 0 | 3 | 9.6875 | 2 | 1 | 1 | 6 |
+| 3 | 895 | 3 | 0 | 1 | 8.6625 | 0 | 1 | 1 | 3 |
+| 4 | 896 | 3 | 1 | 1 | 12.2875 | 0 | 3 | 0 | 3 |
+
+
+We can now create the FareBand feature.
+
+
+```python
+train_df['FareBand'] = pd.qcut(train_df['Fare'], 4)
+train_df[['FareBand', 'Survived']].groupby(['FareBand'], as_index=False).mean().sort_values(by='FareBand', ascending=True)
+```
+
+
+|   | FareBand | Survived |
+|---|---|---|
+| 0 | (-0.001, 7.91] | 0.197309 |
+| 1 | (7.91, 14.454] | 0.303571 |
+| 2 | (14.454, 31.0] | 0.454955 |
+| 3 | (31.0, 512.329] | 0.581081 |
+
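Unlike the equal-width Age bands, the fare bands are cut at quartiles, so each band holds roughly the same number of passengers. The sketch below illustrates that idea in plain Python. It makes no claim to reproduce the exact edges pandas computes; `quartile_edges` and `to_band` are hypothetical helpers, and the toy fares are invented.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_edges(values):
    """Return the three inner quartile cut points (25%, 50%, 75%)."""
    # method="inclusive" interpolates linearly between the sorted data points.
    return quantiles(values, n=4, method="inclusive")


def to_band(value, edges):
    """Ordinal band for a value; bands are closed on the right, like (7.91, 14.454]."""
    return bisect_left(edges, value)


fares = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical toy fares
edges = quartile_edges(fares)
bands = [to_band(f, edges) for f in fares]
print(edges, bands)
```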
+
+Convert the Fare feature to ordinal values based on the FareBand.
+
+
+```python
+for dataset in combine:
+    dataset.loc[ dataset['Fare'] <= 7.91, 'Fare'] = 0
+    dataset.loc[(dataset['Fare'] > 7.91) & (dataset['Fare'] <= 14.454), 'Fare'] = 1
+    dataset.loc[(dataset['Fare'] > 14.454) & (dataset['Fare'] <= 31), 'Fare'] = 2
+    dataset.loc[ dataset['Fare'] > 31, 'Fare'] = 3
+    dataset['Fare'] = dataset['Fare'].astype(int)
+
+train_df = train_df.drop(['FareBand'], axis=1)
+combine = [train_df, test_df]
+
+train_df.head(10)
+```
+
+
+|   | Survived | Pclass | Sex | Age | Fare | Embarked | Title | IsAlone | Age*Class |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 0 | 3 | 0 | 1 | 0 | 0 | 1 | 0 | 3 |
+| 1 | 1 | 1 | 1 | 2 | 3 | 1 | 3 | 0 | 2 |
+| 2 | 1 | 3 | 1 | 1 | 1 | 0 | 2 | 1 | 3 |
+| 3 | 1 | 1 | 1 | 2 | 3 | 0 | 3 | 0 | 2 |
+| 4 | 0 | 3 | 0 | 2 | 1 | 0 | 1 | 1 | 6 |
+| 5 | 0 | 3 | 0 | 1 | 1 | 2 | 1 | 1 | 3 |
+| 6 | 0 | 1 | 0 | 3 | 3 | 0 | 1 | 1 | 3 |
+| 7 | 0 | 3 | 0 | 0 | 2 | 0 | 4 | 0 | 0 |
+| 8 | 1 | 3 | 1 | 1 | 1 | 0 | 3 | 0 | 3 |
+| 9 | 1 | 2 | 1 | 0 | 2 | 1 | 3 | 0 | 0 |
+
+
+And the same for the test dataset.
+
+
+```python
+test_df.head(10)
+```
+
+
+|   | PassengerId | Pclass | Sex | Age | Fare | Embarked | Title | IsAlone | Age*Class |
+|---|---|---|---|---|---|---|---|---|---|
+| 0 | 892 | 3 | 0 | 2 | 0 | 2 | 1 | 1 | 6 |
+| 1 | 893 | 3 | 1 | 2 | 0 | 0 | 3 | 0 | 6 |
+| 2 | 894 | 2 | 0 | 3 | 1 | 2 | 1 | 1 | 6 |
+| 3 | 895 | 3 | 0 | 1 | 1 | 0 | 1 | 1 | 3 |
+| 4 | 896 | 3 | 1 | 1 | 1 | 0 | 3 | 0 | 3 |
+| 5 | 897 | 3 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
+| 6 | 898 | 3 | 1 | 1 | 0 | 2 | 2 | 1 | 3 |
+| 7 | 899 | 2 | 0 | 1 | 2 | 0 | 1 | 0 | 2 |
+| 8 | 900 | 3 | 1 | 1 | 0 | 1 | 3 | 1 | 3 |
+| 9 | 901 | 3 | 0 | 1 | 2 | 0 | 1 | 0 | 3 |
+
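+
+## Model, predict and solve
+
+Now we are ready to train a model and predict the required solution. There are 60+ predictive modelling algorithms to choose from. We must understand the type of problem and the solution requirement to narrow these down to a select few models. Our problem is a classification and regression problem: we want to identify the relationship between the output (Survived or not) and other variables or features (Sex, Age, Port...). It is also a supervised learning problem, as we train our model on a dataset with known labels. With these two criteria, supervised learning plus classification and regression, we can narrow down our choice of models to a few. These include:
+- Logistic Regression
+- KNN or k-Nearest Neighbors
+- Support Vector Machines
+- Naive Bayes classifier
+- Decision Tree
+- Random Forest
+- Perceptron
+- Artificial neural network
+- RVM or Relevance Vector Machine
+
+
+
+```python
+X_train = train_df.drop("Survived", axis=1)
+Y_train = train_df["Survived"]
+X_test  = test_df.drop("PassengerId", axis=1).copy()
+X_train.shape, Y_train.shape, X_test.shape
+```
+
+
+
+
+    ((891, 8), (891,), (418, 8))
+
+
+
+Logistic Regression is simple in form and easy to fit, which makes it a useful model to run early in the workflow. It uses the prediction of a linear model to approximate the log-odds of the true label, parameterized through the logistic function. Reference [Wikipedia](https://en.wikipedia.org/wiki/Logistic_regression).
+
+Note that the "confidence score" reported by the model is its accuracy measured on the training dataset.
+
+
+```python
+# Logistic Regression
+
+logreg = LogisticRegression()
+logreg.fit(X_train, Y_train)
+Y_pred = logreg.predict(X_test)
+acc_log = round(logreg.score(X_train, Y_train) * 100, 2)
+acc_log
+```
+
+
+
+
+    80.359999999999999
+
+
+
+We can use Logistic Regression to validate our assumptions and decisions for the feature creating and completing goals. This can be done by calculating the coefficients of the features in the decision function.
+
+Positive coefficients increase the log-odds of the response (and thus increase the probability), and negative coefficients decrease the log-odds of the response (and thus decrease the probability).
+
+- Sex has the highest positive coefficient, implying that as the Sex value increases (male: 0 to female: 1), the probability of Survived=1 increases the most.
+- Inversely, as Pclass increases, the probability of Survived=1 decreases the most.
+- Age*Class is a good artificial feature to model, as it has the second-largest negative coefficient.
+- So is Title, with the second-highest positive coefficient.
+
+
+```python
+coeff_df = pd.DataFrame(train_df.columns.delete(0))
+coeff_df.columns = ['Feature']
+# The column holds logistic regression coefficients; the original name "Correlation" is kept.
+coeff_df["Correlation"] = pd.Series(logreg.coef_[0])
+
+coeff_df.sort_values(by='Correlation', ascending=False)
+```
+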
+
+|   | Feature | Correlation |
+|---|---|---|
+| 1 | Sex | 2.201527 |
+| 5 | Title | 0.398234 |
+| 2 | Age | 0.287164 |
+| 4 | Embarked | 0.261762 |
+| 6 | IsAlone | 0.129140 |
+| 3 | Fare | -0.085150 |
+| 7 | Age*Class | -0.311199 |
+| 0 | Pclass | -0.749006 |
+
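To make the link between a coefficient and a probability concrete, the sketch below converts log-odds to probabilities with the logistic function. It is an illustration only: the Sex coefficient is taken from the table above, while the baseline log-odds of -1.0 is a hypothetical value chosen for the example and does not come from the fitted model.

```python
import math


def sigmoid(z):
    """Map log-odds z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


sex_coef = 2.201527  # coefficient for Sex from the table above

# Holding the other features fixed, moving Sex from 0 to 1 multiplies the odds by exp(coef).
odds_ratio = math.exp(sex_coef)

# Hypothetical baseline log-odds, chosen only for illustration.
baseline = -1.0
p_male = sigmoid(baseline)
p_female = sigmoid(baseline + sex_coef)

print(round(odds_ratio, 2), round(p_male, 3), round(p_female, 3))
```

The odds ratio depends only on the coefficient, whereas the change in probability also depends on the baseline, which is why coefficients are read on the log-odds scale.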
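+
+Next we model using Support Vector Machines (SVM), which are supervised learning models with associated learning algorithms that analyze data for classification and regression. For binary classification, the SVM training algorithm builds a model that finds the separating hyperplane lying "midway" between the two classes of training samples, because that hyperplane is the most tolerant of local perturbations in the training samples. Reference [Wikipedia](https://en.wikipedia.org/wiki/Support_vector_machine).
+
+Note that the SVM model generates a confidence score which is higher than that of the Logistic Regression model.
+
+
+```python
+# Support Vector Machines
+
+svc = SVC()
+svc.fit(X_train, Y_train)
+Y_pred = svc.predict(X_test)
+acc_svc = round(svc.score(X_train, Y_train) * 100, 2)
+acc_svc
+```
+
+
+
+
+    83.840000000000003
+
+
+
+In pattern recognition, the k-Nearest Neighbors algorithm (k-NN for short) is a non-parametric method used for classification and regression. A test sample is classified by finding the k training samples closest to it and taking the most common class label among them (k is a positive integer, typically small). If k = 1, the sample is simply assigned the class of its single nearest neighbor. Reference [Wikipedia](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm).
+
+The KNN confidence score is better than both Logistic Regression and SVM.
+
+
+```python
+knn = KNeighborsClassifier(n_neighbors = 3)
+knn.fit(X_train, Y_train)
+Y_pred = knn.predict(X_test)
+acc_knn = round(knn.score(X_train, Y_train) * 100, 2)
+acc_knn
+```
+
+
+
+
+    84.739999999999995
+
+
+
+In machine learning, naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with the assumption that all features are independent of each other. Naive Bayes classifiers are highly scalable, requiring a number of parameters that is linear in the number of features in a learning problem. Reference [Wikipedia](https://en.wikipedia.org/wiki/Naive_Bayes_classifier).
+
+The confidence score generated by this model is the lowest among the models evaluated so far.
+
+
+```python
+# Gaussian Naive Bayes
+
+gaussian = GaussianNB()
+gaussian.fit(X_train, Y_train)
+Y_pred = gaussian.predict(X_test)
+acc_gaussian = round(gaussian.score(X_train, Y_train) * 100, 2)
+acc_gaussian
+```
+
+
+
+
+    72.280000000000001
+
+
+
+The perceptron is an algorithm for supervised learning of binary classifiers (functions that can decide whether an input, represented by a vector of numbers, belongs to some specific class or not). It is a type of linear classifier, i.e. a classification algorithm that makes its predictions with a linear predictor function combining a set of weights with the feature vector. The algorithm allows for online learning, in that it processes the elements of the training set one at a time. Reference [Wikipedia](https://en.wikipedia.org/wiki/Perceptron).
+
+
+```python
+# Perceptron
+
+perceptron = Perceptron()
+perceptron.fit(X_train, Y_train)
+Y_pred = perceptron.predict(X_test)
+acc_perceptron = round(perceptron.score(X_train, Y_train) * 100, 2)
+acc_perceptron
+```
+
+    D:\Anaconda\Anaconda3\lib\site-packages\sklearn\linear_model\stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in  in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
+      "and default tol will be 1e-3." % type(self), FutureWarning)
+
+
+
+
+
+    78.0
+
+
+
+
+```python
+# Linear SVC
+
+linear_svc = LinearSVC()
+linear_svc.fit(X_train, Y_train)
+Y_pred = linear_svc.predict(X_test)
+acc_linear_svc = round(linear_svc.score(X_train, Y_train) * 100, 2)
+acc_linear_svc
+```
+
+
+
+
+    79.120000000000005
+
+
+
+
+```python
+# Stochastic Gradient Descent
+
+sgd = SGDClassifier()
+sgd.fit(X_train, Y_train)
+Y_pred = sgd.predict(X_test)
+acc_sgd = round(sgd.score(X_train, Y_train) * 100, 2)
+acc_sgd
+```
+
+    D:\Anaconda\Anaconda3\lib\site-packages\sklearn\linear_model\stochastic_gradient.py:128: FutureWarning: max_iter and tol parameters have been added in  in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.
+      "and default tol will be 1e-3." % type(self), FutureWarning)
+
+
+
+
+
+    80.019999999999996
+
+
+
+This model uses a decision tree as a predictive model which maps features (tree branches) to conclusions about the target value (tree leaves). Tree models where the target variable takes a finite set of values are called classification trees; in these tree structures, leaves represent class labels, every other node corresponds to a test on an attribute, and the samples at each node are split among its child nodes according to the result of that test. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees. Reference [Wikipedia](https://en.wikipedia.org/wiki/Decision_tree_learning).
+
+The model confidence score is the highest among the models evaluated so far.
+
+
+```python
+# Decision Tree
+
+decision_tree = DecisionTreeClassifier()
+decision_tree.fit(X_train, Y_train)
+Y_pred = decision_tree.predict(X_test)
+acc_decision_tree = round(decision_tree.score(X_train, Y_train) * 100, 2)
+acc_decision_tree
+```
+
+
+
+
+    86.760000000000005
+
+
+
+Random Forest is one of the most popular models. Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks. They operate by constructing a multitude of decision trees (n_estimators=100) at training time and then combining these individual learners with an aggregation strategy. Reference [Wikipedia](https://en.wikipedia.org/wiki/Random_forest).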
+
+The model confidence score is the highest among the models evaluated so far, tied with the decision tree. We decide to use this model's output (Y_pred) for our competition submission.
+
+
+```python
+# Random Forest
+
+random_forest = RandomForestClassifier(n_estimators=100)
+random_forest.fit(X_train, Y_train)
+Y_pred = random_forest.predict(X_test)
+random_forest.score(X_train, Y_train)
+acc_random_forest = round(random_forest.score(X_train, Y_train) * 100, 2)
+acc_random_forest
+```
+
+
+
+
+    86.760000000000005
+
+
+
+### Model evaluation
+
+We can now rank all the models to choose the best one for our problem.
+While Decision Tree and Random Forest score the same, we choose Random Forest, as it corrects for the decision tree's habit of overfitting to the training set.
+
+
+```python
+models = pd.DataFrame({
+    'Model': ['Support Vector Machines', 'KNN', 'Logistic Regression',
+              'Random Forest', 'Naive Bayes', 'Perceptron',
+              'Stochastic Gradient Descent', 'Linear SVC',
+              'Decision Tree'],
+    'Score': [acc_svc, acc_knn, acc_log,
+              acc_random_forest, acc_gaussian, acc_perceptron,
+              acc_sgd, acc_linear_svc, acc_decision_tree]})
+models.sort_values(by='Score', ascending=False)
+```
+
+
+|   | Model | Score |
+|---|---|---|
+| 3 | Random Forest | 86.76 |
+| 8 | Decision Tree | 86.76 |
+| 1 | KNN | 84.74 |
+| 0 | Support Vector Machines | 83.84 |
+| 2 | Logistic Regression | 80.36 |
+| 6 | Stochastic Gradient Descent | 80.02 |
+| 7 | Linear SVC | 79.12 |
+| 5 | Perceptron | 78.00 |
+| 4 | Naive Bayes | 72.28 |
+
+
+
+```python
+submission = pd.DataFrame({
+        "PassengerId": test_df["PassengerId"],
+        "Survived": Y_pred
+    })
+# submission.to_csv('../output/submission.csv', index=False)
+```
+
+Our submission to the Kaggle competition site ranked 3,883 out of 6,082 competition entries.
+This result is indicative while the competition is running.
+It only accounts for part of the submission dataset.
+Not bad for our first attempt.
+Any suggestions to improve our score are most welcome.
+
+## References
+
+This notebook was created based on the great work done in solving the Titanic competition, and on other sources.
+
+- [A journey through Titanic](https://www.kaggle.com/omarelgabry/titanic/a-journey-through-titanic)
+- [Getting Started with Pandas: Kaggle's Titanic Competition](https://www.kaggle.com/c/titanic/details/getting-started-with-random-forests)
+- [Titanic Best Working Classifier](https://www.kaggle.com/sinakhorami/titanic/titanic-best-working-classifier)