Introduction to AI. Math in Machine Learning seminar (MiML) McGill Math and Stats (McMaS)


Marvin Alexander

1 Introduction to AI Math in Machine Learning seminar (MiML) McGill Math and Stats (McMaS)

2 Background: AI. Artificial Intelligence is loosely defined as intelligence exhibited by machines. Operationally: R&D in CS academic sub-disciplines such as Computer Vision, Natural Language Processing (NLP), Robotics, etc.

3 Artificial General Intelligence (AGI). AI: specific tasks. AGI: general cognitive abilities. AGI is a small research area within AI that aims to build machines that can successfully perform any task that a human might do. On account of this ambitious goal, AGI has high visibility, disproportionate to its size or present level of success, among futurists, science fiction writers, and the public.

4 "Perspectives on Research in Artificial Intelligence and Artificial General Intelligence Relevant to DoD": taken from a study by JASON for the Department of Defense (DoD).

5 Historical Context. The term "AI" is coined. 1960: perceptrons implied that machines could learn from data. Decline: the perceptron is not a universal function approximator, only a linear discriminator, and can't learn XOR. 1980: resurgence in AI with expert systems and learning rules; petered out. 1990s: academic AI in the doldrums. Improved computers led, in 1997, to IBM's Deep Blue beating champion Garry Kasparov at chess. Chess, once believed to require human intelligence, fell to a special-purpose, very fast search algorithm.

6 Don't confuse AI with AGI. 1997, NYT in response to Deep Blue: "to play a decent game of Go [requires human intelligence] when that happens, will be a sign that AI is as good as the real thing". 2016: the NYT is proved wrong when Google's AlphaGo beats world champion Lee Sedol. This did not involve a breakthrough: it also used a hybrid of DNNs with massively parallel tree search and Reinforcement Learning. DNNs require massive amounts of data, which can be found labelled on the internet, taken from the databases of private companies like Facebook or Google, or generated by a fast computer playing a lifetime of games.

7 2010: Deep Learning Revolution. Neural networks have been around for half a century and were popular in the 1990s for solving simple tasks. Starting around 2010, new hardware, Graphics Processing Units (GPUs), became available, which allowed for much larger and deeper networks. Large labelled data sets also became available, which allowed for training.

8 2010: Deep Learning Revolution. The large data set ImageNet became available in 2009. In 2012 AlexNet, trained on GPUs, won the ImageNet competition with an error of 15.3%, more than 10 percentage points better than the runner-up. Canadian (U Toronto) team: Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever. Between 2011 and 2015, the error rate for image captioning by computer fell from 25% to 3%, better than the accepted human figure of 5%. (Figure: more than 95% of predictions give the correct caption; green column.)

9 DNN and Images

10 DNNs exceed human performance in: some kinds of image recognition; spoken word recognition; the game of Go (long thought to require generalized human intelligence, AGI); self-driving cars, now more limited by policy than tech.

11 Rapidly advancing areas: Reinforcement Learning; graphical and Bayesian models, especially with probabilistic programming; generative models (creating artificial images). More likely, DL will become an essential building block of a hybrid approach.

12 Reinforcement Learning. Learn how to play Atari from raw image pixels. Learn how to beat Go champions (huge branching factor). Robots learning how to walk. Big in Montreal: Google DeepMind and Microsoft Research both work in this area.

13 Generative Models. A generative model takes random vectors as input and outputs realistic images (of a certain class).

14 Generative Models. Discriminator: has learned what a picture looks like. Generator: tries to generate a believable picture.
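The generator/discriminator pair is usually formalized as a two-player game. The slides do not name the framework; assuming they refer to generative adversarial networks (GANs), the standard objective can be sketched as follows. The symbols G (generator), D (discriminator), z (random input vector), p_data, and p_z are notation introduced here, not taken from the slides.

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here D(x) is the discriminator's estimated probability that x is a real picture, so D is rewarded for telling real from generated images, while G is rewarded for producing images that D accepts.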

15 Hardware. Both training (finding the best weights) and inference (evaluating the output of the network on a data point) are computationally intensive. Effort is measured in joules. Hardware, software, and algorithms have evolved together. Currently, one needs to use Graphics Processing Units (GPUs) rather than CPUs.

16 Hardware. To do inference on mobile devices, we need custom architectures and custom hardware. Current engineering practice is to take trained networks and make them smaller, and also to build custom chips with a power source. Research problem: design and train architectures with efficient inference in mind.

17 Machine Learning vs DL. Traditional Machine Learning (ML) can't compete with the raw performance of Deep Learning. However, ML has performance guarantees, which are important in the many applications where errors are costly.

18 Error Estimates. Using probability (Central Limit Theorem) and linear or parametric models, we can fit data and also estimate the probability of an error. Deep Learning models lack these estimates on errors!
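As a small illustration of a CLT-based error estimate, the sketch below computes a normal-approximation confidence interval for a measured error rate. It is a simpler instance than the parametric models the slide has in mind; the function name `error_rate_interval` and the test-set numbers are hypothetical.

```python
import math

def error_rate_interval(errors, z=1.96):
    """Normal-approximation (CLT) confidence interval for an error rate.

    errors: list of 0/1 flags, 1 meaning the model was wrong on that test point.
    z: normal quantile; 1.96 corresponds to roughly 95% coverage.
    """
    n = len(errors)
    p_hat = sum(errors) / n                         # observed error rate
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)  # CLT standard error
    return p_hat, p_hat - z * std_err, p_hat + z * std_err

# Hypothetical test set: 1000 points, 150 of them misclassified.
errors = [1] * 150 + [0] * 850
p_hat, low, high = error_rate_interval(errors)
print(f"error rate {p_hat:.3f}, approx. 95% interval [{low:.3f}, {high:.3f}]")
```

The interval shrinks like one over the square root of the number of test points, which is the kind of quantitative statement the slide refers to.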

19 Neural Network Architecture

$$ y = \sum_i w_i x_i + b, \qquad z = \mathrm{ReLU}(y) = \max(y, 0) $$
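A single neuron of this form can be sketched in a few lines of NumPy. The input, weight, and bias values below are made up for illustration.

```python
import numpy as np

def relu(y):
    # ReLU(y) = max(y, 0), applied elementwise
    return np.maximum(y, 0.0)

def neuron(x, w, b):
    # Weighted sum of the inputs plus a bias, followed by the ReLU nonlinearity
    y = np.dot(w, x) + b
    return relu(y)

# Hypothetical inputs and weights
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.4, 0.3, -0.2])

z_off = neuron(x, w, b=0.1)  # pre-activation is negative, so the output is clipped to 0
z_on = neuron(x, w, b=0.5)   # pre-activation is positive, so it passes through
print(z_off, z_on)
```

A layer applies many such neurons to the same input, and a deep network composes several layers.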

20 Convolutional Neural Nets. Deep NN: allows different weights everywhere. Convolutional NN: special case, for images, where weights are nonzero only for nearby neighbors (at the input level and later). In addition, for each layer, the pattern of the weights is the same at every location. This significantly reduces the total number of weights per layer, allowing for much deeper networks.
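The saving in weights can be made concrete with a one-dimensional toy example: a fully connected layer needs one weight per input/output pair, while a convolution reuses one small kernel at every location. The sizes and the kernel below are hypothetical, chosen only for illustration.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide one shared kernel along x (cross-correlation, no padding)."""
    k = len(kernel)
    return np.array([np.dot(kernel, x[i:i + k]) for i in range(len(x) - k + 1)])

# Weight counts for a layer mapping 100 inputs to 98 outputs
n_in, k = 100, 3
n_out = n_in - k + 1
dense_weights = n_in * n_out  # fully connected: every input feeds every output
conv_weights = k              # convolution: one local kernel shared by all locations
print(dense_weights, conv_weights)

# The same kernel is applied at every position of the input
x = np.arange(5, dtype=float)        # [0, 1, 2, 3, 4]
kernel = np.array([1.0, 0.0, -1.0])  # responds to local differences
y = conv1d_valid(x, kernel)
print(y)
```

For images the same idea applies in two dimensions, with a small kernel per output channel instead of one weight per pixel pair.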

21 More architecture

22 Training. Given data $x_i$, labels $y(x_i)$, and a network $u(x; w)$ with weights $w$, solve

$$ \min_w L(w) = \frac{1}{n} \sum_i \ell(u(x_i; w), y(x_i)) $$

The sum is over a large number (millions) of data points. Instead, approximate the sum by a random subset (mini-batch) of hundreds of data points. Minimize over the weights by stochastic gradient descent (SGD): take a small step in the negative gradient direction,

$$ w_{n+1} = w_n - dt_n \nabla_w L(w_n) $$

The step size $dt_n$ is called the learning rate. SGD has a faster version, Nesterov's momentum, which adds a momentum term to the update. The gradient is computed (automatically, by software) using the chain rule.
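The mini-batch procedure can be sketched on a toy least-squares problem, where the gradient is available in closed form rather than through automatic differentiation. Everything here (the synthetic data, the step size `dt`, the batch size, the momentum coefficient) is an assumption made for illustration, and the momentum variant uses one common formulation of Nesterov's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (hypothetical): y = X @ w_true + small noise
n, d = 10_000, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)

def loss(w):
    # Full loss L(w): mean of per-sample squared errors
    return 0.5 * np.mean((X @ w - y) ** 2)

def minibatch_grad(w, idx):
    # Gradient of the loss restricted to the mini-batch idx
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / len(idx)

def sgd(w0, dt=0.05, batch_size=100, steps=500, momentum=0.0):
    w = w0.copy()
    v = np.zeros_like(w)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        # With momentum > 0 the gradient is taken at a look-ahead point (Nesterov);
        # with momentum = 0 this reduces to plain SGD: w <- w - dt * grad
        g = minibatch_grad(w + momentum * v, idx)
        v = momentum * v - dt * g
        w = w + v
    return w

w0 = np.zeros(d)
w_sgd = sgd(w0)
w_nag = sgd(w0, momentum=0.9)
print(loss(w0), loss(w_sgd), loss(w_nag))
```

Each step touches only 100 of the 10,000 data points, which is what makes the method affordable when the full sum runs over millions of terms.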

23 Technical Details. Training CNNs is buggy. Gradients can be zero, causing training to stall, or can blow up. Lots of hacks or heuristics are used to help.

$$ J(w) = L(w) + \lambda \|w\|_2^2 \qquad J(w) = L(w) + \lambda \|w\|_1 $$

Regularization: the loss function is nonconvex, so add an extra term that makes the objective better behaved (more convex) and keeps the weights small. Called weight decay. Data augmentation (e.g. cutout): change the images at each stage, keeping the same labels. Dropout: randomly set half the units (activations) to zero at each iteration (a heuristic which helps). Batch normalization: normalize the input data to each neuron to have mean zero and variance one.
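Three of these heuristics are short enough to sketch directly in NumPy. The function names, the regularization coefficient `lam`, and the step size `dt` are hypothetical; dropout is shown in its common "inverted" form acting on activations, and batch normalization is shown without the learned scale and shift that practical implementations add.

```python
import numpy as np

rng = np.random.default_rng(0)

def weight_decay_step(w, grad_L, dt=0.1, lam=0.01):
    # Gradient step on J(w) = L(w) + lam * ||w||_2^2,
    # whose gradient is grad_L + 2 * lam * w: the extra term shrinks the weights
    return w - dt * (grad_L + 2.0 * lam * w)

def dropout(a, p_drop=0.5):
    # Zero each activation with probability p_drop and rescale the survivors
    # so that the expected value is unchanged (training-time behaviour only)
    mask = rng.random(a.shape) >= p_drop
    return a * mask / (1.0 - p_drop)

def batch_norm(a, eps=1e-5):
    # Normalize each feature (column) over the mini-batch (rows)
    # to mean zero and variance approximately one
    mean = a.mean(axis=0)
    var = a.var(axis=0)
    return (a - mean) / np.sqrt(var + eps)

w_new = weight_decay_step(np.ones(3), grad_L=np.zeros(3))  # pure decay step
acts = rng.normal(loc=3.0, scale=2.0, size=(64, 4))        # hypothetical mini-batch
dropped = dropout(acts)
normed = batch_norm(acts)
print(w_new, normed.mean(axis=0), normed.var(axis=0))
```

With a zero loss gradient the weight-decay step multiplies every weight by 1 - 2 * dt * lam, which is where the name comes from.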

24 Challenges for rigorous deep learning. "it is not clear that the existing AI paradigm is immediately amenable to any sort of software engineering validation and verification. This is a serious issue, and is a potential roadblock to DoD's use of these modern AI systems, especially when considering the liability and accountability of using AI in lethal systems." JASON report (italics mine)

25 Evolution of engineering discipline

26 Importance of "-ilities": reliability, maintainability, accountability, verifiability, evolvability, attackability.

27 Challenge: Adversarial Examples. Goodfellow et al., "Explaining and Harnessing Adversarial Examples", 2015.
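The cited paper introduces the fast gradient sign method (FGSM), which perturbs an input by a small amount in the direction that increases the loss. The sketch below applies it to a tiny logistic-regression classifier rather than a deep network; the weights, input, and perturbation size `eps` are made up for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fgsm(x, y, w, b, eps):
    """x_adv = x + eps * sign(grad_x loss) for logistic regression, y in {0, 1}."""
    p = sigmoid(np.dot(w, x) + b)
    grad_x = (p - y) * w  # gradient of the cross-entropy loss with respect to the input
    return x + eps * np.sign(grad_x)

# Hypothetical classifier and input with true label 1
w = np.array([1.0, -1.0, 0.5, -0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.1, -0.1])
y = 1

x_adv = fgsm(x, y, w, b, eps=0.2)
p_clean = sigmoid(np.dot(w, x) + b)    # above 0.5: classified as 1
p_adv = sigmoid(np.dot(w, x_adv) + b)  # below 0.5: the prediction flips
print(p_clean, p_adv)
```

No coordinate of the input moves by more than `eps`, yet the prediction changes, because the small per-coordinate changes all push the score in the same direction.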

28 Hot current topic: next time we will talk about our progress on it.
