Automatic Text Summarization
News
A Moodle profile has been set up for the course. Please sign up there and try to find a group. From now on, we will use this platform for all announcements.
The room has changed to C110 (13:30-15:10).
Content
The automatic generation of summaries from a collection of different types of texts on a single subject is an active field of research, pursued intensively at the AIPHES graduate school at TU Darmstadt, for example.
The methods used for this purpose originate from both machine learning and natural language processing.
In this practical course, students will have the opportunity to familiarize themselves with these methods in small groups, to develop extensions and new methods, and to apply them to a real dataset. This dataset will contain a set of documents associated with a list of topics. In a second phase, the participants will evaluate the summaries produced by other groups.
A more detailed overview can be found in the kick-off slides.
Prerequisites and Registration
Completion of a lecture in machine learning, data mining, or natural language processing. Practical experience with data mining or NLP tools is helpful, but can also be acquired independently. Good to very good knowledge of written English.
For further questions, feel free to send an email to autots@ke.tu-darmstadt.de. No prior registration is needed; however, please still send us an email so that we can estimate the number of participants beforehand and have your email address for possible announcements. Also make sure that you are registered in TUCaN.
Who, when and where?
The kick-off meeting will take place on Friday, 13th of April at 13:30 in A313. Please note that participation in this event is mandatory.
The regular meetings will take place approximately every two or three weeks on Fridays at 13:30 in C110.
Tentative Schedule
Evaluation
The solution will be created in small groups. Your commitment to the course and the quality of your solution will be assessed. In addition, there will be presentations and written submissions, which will also be included in the evaluation.
Forum
The forum is intended to serve as a platform for participants to ask questions and exchange ideas and results. We have set up a Moodle for this purpose.
Literature
- TBA
Tools
Some useful tools around summarization:
- NLTK: Powerful library for natural language processing (Python)
- scikit-learn: Framework for machine learning (Python)
- sumy: Various implementations of summarization algorithms, and links to many more (Python)
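To give a feel for the kind of baseline that such tools build on, here is a minimal sketch of a frequency-based extractive summarizer. It is not the course's method and does not use any of the libraries above; it relies only on the Python standard library. The function names, the naive sentence splitter, and the tiny stop-word list are assumptions made for illustration, and a real system would use proper tokenization (for example from NLTK) and a more careful scoring scheme.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "on", "for", "it", "this", "that"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences in their original order.

    A sentence's score is the average corpus frequency of its tokens.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        return sum(freq[w] for w in tokens) / len(tokens)

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return [sentences[i] for i in chosen]
```

Calling `summarize(document_text, n_sentences=3)` on the concatenated documents of one topic would return three sentences as a list of strings. Averaging by sentence length is one possible design choice; it keeps long sentences from winning purely because they contain more words.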
Links
- TBA
Contact
Margot Mieskes, Eneldo Loza Mencía