Seminar on Machine Learning and Data Mining

Representation Learning

The seminar is available in TUCaN under module number 20-00-0102.

The paper assignments can be found below. Please remember to send your slides one week before the presentation in order to get feedback.

When and Where?

The seminar meetings will be on Tuesdays at 17:10 in room E202. The kick-off meeting is on Tuesday, April 25. Please note that both the room assignment and the kick-off date differ from the information in TUCaN.

Content

In the course of this seminar we will try to get an overview of the current state of research in one domain. This year's topic is Representation Learning, i.e., methods that learn to transform data into a representation that can be used by subsequent algorithms. We will cover both important traditional work and recent papers published in workshops, journals, and conferences.
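
To make this concrete, below is a minimal sketch of such a transform-then-predict pipeline in Python, using scikit-learn's PCA as a stand-in for the representation learner (the library, dataset, and hyperparameters are our own illustrative choices, not part of the seminar material):

    # Learn a representation (PCA), then feed it to a subsequent algorithm.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)              # 64 pixel features
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    pca = PCA(n_components=16).fit(X_tr)             # learn the transformation
    clf = LogisticRegression(max_iter=1000).fit(pca.transform(X_tr), y_tr)
    print("accuracy on the 16-dim representation:",
          clf.score(pca.transform(X_te), y_te))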

Schedule

(25.4.) Kick-off Meeting

(9.5.) Principal Components Analysis and Independent Component Analysis

  • Christoph M:
    Abdi, H., & Williams, L. J. (2010). "Principal component analysis". Wiley Interdisciplinary Reviews: Computational Statistics, 2(4): 433–459.
  • Nils J.:
    Hyvärinen, Aapo; Oja, Erkki (2000). "Independent Component Analysis: Algorithms and Applications". Neural Networks, 13(4–5): 411–430. (See the ICA sketch after this list.)
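
As an illustration of the ICA reference above, here is a minimal blind source separation sketch using scikit-learn's FastICA, an implementation of the algorithm family described by Hyvärinen and Oja (the toy signals and mixing matrix are our own choices):

    # Mix two independent signals, then recover them from the mixtures alone.
    import numpy as np
    from sklearn.decomposition import FastICA

    t = np.linspace(0, 8, 2000)
    S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]  # two independent sources
    A = np.array([[1.0, 0.5], [0.5, 1.0]])            # mixing matrix
    X = S @ A.T                                       # observed mixtures

    S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)
    # S_hat recovers the sources up to permutation, sign, and scale.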

(16.5.) Probabilistic Topic Models

  • Jonas K.:
    Thomas Hofmann: Probabilistic Latent Semantic Analysis. UAI 1999: 289-296
    Thomas Hofmann: Probabilistic Latent Semantic Indexing. SIGIR 1999: 50-57
  • Stefan H.:
    Blei, David M.; Ng, Andrew Y.; Jordan, Michael I. (2003). "Latent Dirichlet Allocation". Journal of Machine Learning Research, 3: 993–1022.

(23.5.) Matrix Factorization

  • Nils B.:
    Lee, Daniel D., and H. Sebastian Seung. "Learning the parts of objects by non-negative matrix factorization." Nature 401.6755 (1999): 788-791.
    Lee, Daniel D., and H. Sebastian Seung. "Algorithms for non-negative matrix factorization." Advances in Neural Information Processing Systems. 2001. (See the NMF sketch after this list.)
  • Peter J.:
    Pauli Miettinen: Sparse Boolean Matrix Factorizations. ICDM 2010: 935-940
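
As a pointer into the Lee & Seung "Algorithms" paper above, here is a minimal NumPy sketch of their multiplicative update rules for the squared Frobenius error (our own simplified rendering, not the authors' reference code):

    import numpy as np

    def nmf(V, r, iters=200, eps=1e-9, seed=0):
        """Factor non-negative V (m x n) into W (m x r) and H (r x n)
        by Lee & Seung's multiplicative updates for ||V - W H||_F^2."""
        rng = np.random.default_rng(seed)
        W = rng.random((V.shape[0], r))
        H = rng.random((r, V.shape[1]))
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + eps)  # updates keep H >= 0
            W *= (V @ H.T) / (W @ H @ H.T + eps)  # updates keep W >= 0
        return W, H

    V = np.random.default_rng(1).random((20, 30))  # non-negative data
    W, H = nmf(V, r=5)
    print("reconstruction error:", np.linalg.norm(V - W @ H))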

(30.5.) Compressed Sensing

  • Alexy G.:
    Donoho, David L. "Compressed sensing." IEEE Transactions on Information Theory 52.4 (2006): 1289–1306.
  • Florian C.:
    Daniel Hsu, Sham Kakade, John Langford, Tong Zhang: Multi-Label Prediction via Compressed Sensing. In: Advances in Neural Information Processing Systems 22, pp. 772–780. Curran Associates, 2009.

(6.6.) Local and Global Embeddings

  • Shaohong L.:
    Sam T. Roweis and Lawrence K. Saul: Nonlinear dimensionality reduction by locally linear embedding. Science 290 (2000): 2323–2326
    Lawrence K. Saul, Sam T. Roweis: Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds. Journal of Machine Learning Research 4: 119–155 (2003)
  • Philippe S.:
    J. B. Tenenbaum, V. de Silva, J. C. Langford: A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290 (2000): 2319–2323
    Vin de Silva, Joshua B. Tenenbaum: Global Versus Local Methods in Nonlinear Dimensionality Reduction. NIPS 2002: 705-712

(13.6.) Spectral and Locality-Sensitive Hashing

  • Raffael L.:
    Yair Weiss, Antonio B. Torralba, Robert Fergus: Spectral Hashing. In: Advances in Neural Information Processing Systems (NIPS), pp. 1753–1760. MIT Press, 2008.
  • Moritz F.:
    Loïc Paulevé, Hervé Jégou, Laurent Amsaleg: Locality sensitive hashing: A comparison of hash function types and querying mechanisms. Pattern Recognition Letters 31(11): 1348–1358 (2010). (See the LSH sketch after this list.)
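
To give a flavour of the topic, here is a minimal sketch of one classic LSH family, sign hashing with random hyperplanes for cosine similarity (the choice of family and all details are our own illustration, not taken from the cited papers):

    import numpy as np

    def lsh_codes(X, n_bits=16, seed=0):
        """Hash each row of X to an n_bits binary code: the sign pattern of
        projections onto random hyperplanes. Vectors with small cosine
        distance tend to collide on most bits."""
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((X.shape[1], n_bits))
        return (X @ planes > 0).astype(np.uint8)

    X = np.random.default_rng(1).standard_normal((5, 32))
    print(lsh_codes(X))  # candidate neighbours share (almost) all bits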

(20.6.) Autoencoders

  • Steffen K.:
    G. E. Hinton and R. R. Salakhutdinov: Reducing the dimensionality of data with neural networks. Science 313(5786): 504–507 (2006)
    Ruslan Salakhutdinov, Geoffrey E. Hinton: Deep Boltzmann Machines. AISTATS 2009: 448-455
  • Matej Z.:
    Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, Pierre-Antoine Manzagol: Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research 11: 3371-3408 (2010)

(27.6.) Generative Learning

  • Markus B.:
    Diederik P. Kingma and Max Welling, Auto-Encoding Variational Bayes, In Proceedings of the International Conference on Learning Representations (ICLR), 2014. https://arxiv.org/pdf/1312.6114.pdf
  • Robin H.:
    Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, Generative Adversarial Nets, In Advances in Neural Information Processing Systems (NIPS), 2014. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

(4.7.) Semantic Representations

  • Philipp M.:
    Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, Jeffrey Dean: Distributed Representations of Words and Phrases and their Compositionality. NIPS 2013: 3111-3119
    Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean: Efficient Estimation of Word Representations in Vector Space. CoRR abs/1301.3781 (2013). (See the word2vec sketch after this list.)
  • Seyed S.:
    Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Raquel Urtasun, Antonio Torralba, Sanja Fidler: Skip-Thought Vectors. NIPS 2015: 3294-3302
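
As a usage illustration for the word2vec papers above, here is a minimal skip-gram training sketch with the gensim library (gensim and the toy corpus are our own choices; real applications train on millions of sentences):

    from gensim.models import Word2Vec

    sentences = [
        ["representation", "learning", "maps", "data", "to", "vectors"],
        ["word", "vectors", "capture", "semantic", "similarity"],
        ["similar", "words", "receive", "similar", "vectors"],
    ]
    # sg=1 selects the skip-gram model described by Mikolov et al.
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
    print(model.wv["vectors"][:5])           # a learned 50-dim embedding
    print(model.wv.most_similar("vectors"))  # nearest words in that space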

(11.7.) Ladder Networks

  • Wei D.:
    Antti Rasmus, Mathias Berglund, Mikko Honkala, Harri Valpola, Tapani Raiko: Semi-supervised Learning with Ladder Networks. NIPS 2015: 3546-3557
    Antti Rasmus, Harri Valpola, Mikko Honkala, Mathias Berglund, Tapani Raiko: Semi-Supervised Learning with Ladder Networks. CoRR abs/1507.02672 (2015)
  • Felix G.:
    Mohammad Pezeshki, Linxi Fan, Philemon Brakel, Aaron C. Courville, Yoshua Bengio: Deconstructing the Ladder Network Architecture. ICML 2016: 2368-2376

(18.7.) Applications

  • Aurel K.:
    Haoqiang Fan, Mu Yang, Zhimin Cao, Yuning Jiang, Qi Yin: Learning Compact Face Representation: Packing a Face into an int32. ACM Multimedia 2014: 933-936
    Haoqiang Fan, Zhimin Cao, Yuning Jiang, Qi Yin, Chinchilla Doudou: Learning Deep Face Representation. CoRR abs/1403.2802 (2014)
  • Lisa Z.:
    Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Greg Corrado, Kai Chen, Jeffrey Dean, Andrew Y. Ng: Building high-level features using large scale unsupervised learning. ICML 2012
    Quoc V. Le, Ju Han, Joe W. Gray, Paul T. Spellman, Alexander Borowsky, Bahram Parvin: Learning invariant features of tumor signatures. ISBI 2012: 302-305
    Quoc V. Le, Will Y. Zou, Serena Y. Yeung, Andrew Y. Ng: Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. CVPR 2011: 3361-3368
  • Lovis H.:
    Sebastian Böck, Markus Schedl: Polyphonic piano note transcription with recurrent neural networks. ICASSP 2012: 121-124
    Siddharth Sigtia, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d'Avila Garcez, Simon Dixon: A hybrid recurrent neural network for music transcription. ICASSP 2015: 2061-2065
 

Organization

The topics for the talks will be assigned in the kick-off meeting. Do not miss the kick-off meeting if you want to participate in the seminar. The language used in the seminar will be English.

Prior knowledge is not required, but a background in data mining and machine learning will be helpful. Participation is limited to 20 students. If more students want to participate, those with prior knowledge in data mining and knowledge discovery will be preferred. The selection will be made at the kick-off meeting.

The students are expected to give a 30-minute talk on the material they are assigned, followed by 15 minutes of feedback and questions. Although each topic is typically associated with a single paper, the point of the talk is not to reproduce the entire contents of the paper exactly, but to communicate the key ideas of the methods it introduces. Thus, the content of the talk should exceed the scope of the paper and demonstrate that a thorough understanding of the material was achieved. See also our general advice on giving talks.

For further questions, feel free to send an email to ml-sem@ke.tu-darmstadt.de. No prior registration is needed; however, please still send us an email so that we can estimate the number of participants beforehand and have your e-mail address for possible announcements. Also make sure that you are registered in TUCaN.

Talks

The talks are expected to be accompanied by slides. The students will have to send the slides one week before the talk to ml-sem@ke.tu-darmstadt.de. We will use this opportunity to provide early feedback on common problems such as too many slides, too much text on the slides, small font sizes, etc. The talk and the slides should be in English.

There will be two talks in each meeting. As mentioned above, each topic is associated with one paper, but the talk should not reproduce the content of the paper exactly; rather, it should communicate the key ideas of the introduced method.

All papers should be freely available on the internet or in the ULB. Note that some paper sources such as SpringerLink often only work on campus networks (sometimes not even via VPN). If you cannot find a paper, contact us.

Grading

The slides, the presentation, and the question-and-answer section of the talk will influence the overall grade. Furthermore, students are expected to actively participate in the discussions, and this will also be part of the final grade.

We do not expect a written summary of the material except for the slides.

To achieve a grade in the 1.x range, the talk needs to go beyond a recitation of the given material and include your own ideas, your own experience, or even demos. An exact recitation of the papers will lead to a grade in the 2.x range. A weak presentation and lack of engagement in the discussions may lead to a grade in the 3.x range, or worse. Please also read our guidelines for giving a talk very carefully.

In addition to the grading, we will also give public feedback on the talks immediately after the talks, and we are considering a best presentation award at the end of the seminar.

Topics

Here is a list of topics; each topic consists of two seminar talks (indicated by the bullet list). For each seminar talk, we give 1-3 papers as a starting point. However, note that you are not supposed to reproduce the papers in every detail. For most talks, you should explain the method that is introduced in the paper(s) and show where and how it can be used. Often you will find much better examples or use cases in later publications on these methods. See also our guidelines for giving a talk.

 
