AIN440 Introduction to Natural Language Processing

BBM495 Introduction to Natural Language Processing

This course is given together with its complementary practicum (laboratory) course, “AIN442 Practicum in Natural Language Processing” / “BBM497 Introduction to Natural Language Processing Laboratory”.


Semester       :  2025 Spring
Instructor     :  Ilyas Cicekli
Email          :  ilyas@cs.hacettepe.edu.tr
Class Hours    :  Wednesday 9:30-12:15
Classroom      :  Seminar Hall


AIN442/BBM497:

    Class Hours    :  Wednesday 16:40-18:30
    Classroom      :  D8

Research Assistant   :  İsmail Furkan Atasoy

Email                          :  ismailfurkanatasoy@cs.hacettepe.edu.tr


Text Book

1.   Daniel Jurafsky and James H. Martin, "Speech and Language Processing", Third Edition, Prentice Hall, 2024.

 

Reference Books

1.   Christopher D. Manning and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", The MIT Press, 1999.

2.   Steven Bird, Edward Loper, and Ewan Klein, "Natural Language Processing with Python", O’Reilly Media Inc., 2009.


Grading for AIN440/BBM495

Quizzes      : 27% 

Attendance :   3% 

Midterm     : 30%             

Final           : 40% 

 

Quiz Policy:

·     There will be at least 5 pop quizzes.

·     Pop quizzes will NOT be announced in advance; regular attendance is required in order to take them.

·     There are NO make-ups for pop quizzes.

·     If N quizzes are given, your lowest quiz score counts for 5% of your quiz grade, and your highest N−1 quiz scores count for the remaining 95%.
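
As a rough illustration of the weighting rule above, the following Python sketch computes a quiz grade from a list of scores. The function name quiz_grade and the choice to average the highest N−1 scores equally are assumptions made for this example. The policy does not state how those scores are combined, so the actual calculation may differ.

```python
def quiz_grade(scores):
    """Combine quiz scores: 5% lowest score, 95% the highest N-1 scores.

    Assumption (not stated in the policy): the highest N-1 scores are
    averaged with equal weight. All scores are on the same scale (e.g. 0-100).
    """
    if len(scores) < 2:
        raise ValueError("at least two quiz scores are needed")

    ordered = sorted(scores)      # ascending, so ordered[0] is the lowest score
    lowest = ordered[0]
    highest_rest = ordered[1:]    # the highest N-1 scores

    rest_average = sum(highest_rest) / len(highest_rest)

    # Weights are kept as whole percentages and divided once at the end.
    return (5 * lowest + 95 * rest_average) / 100


if __name__ == "__main__":
    # Hypothetical scores for five quizzes, used only for illustration.
    print(quiz_grade([80, 90, 70, 100, 0]))
```

Under these assumptions, one very low score out of five reduces the quiz grade only slightly, because it carries a 5% weight instead of the 20% it would carry in a plain average.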

 

Attendance Policy:

·     Regular attendance is expected, and 3% of your semester grade will be based on your attendance.

·     Attendance will be taken during class hours, and it may be taken more than once on the same day.

 


 

Grading for AIN442/BBM497:

Programming Assignments : 100%          

 

Programming Assignment Policy:

·     There will be at least four programming assignments.

·     Late Policy:

o  You must submit your programming assignments before their due dates.

o  You may submit your assignments up to three days late, but with a penalty. A 10% penalty will be applied for each day late (1 day late: 10%, 2 days late: 20%, 3 days late: 30%).
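
The late penalty above can be sketched in Python as follows. The function name late_adjusted_score is hypothetical. Two points are assumptions, because the policy does not spell them out: the penalty is taken as a percentage of the score the submission earned, and a submission more than three days late is treated as receiving no credit.

```python
def late_adjusted_score(raw_score, days_late):
    """Apply a 10% penalty per day late, for up to three days late.

    Assumptions (not stated in the policy):
      - the penalty is a percentage of the earned score (raw_score);
      - submissions more than three days late receive 0.
    days_late is a whole number of days; 0 means submitted on time.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    if days_late > 3:
        return 0.0  # assumed: late submissions are accepted for three days only

    penalty_percent = 10 * days_late  # 0, 10, 20, or 30
    return raw_score * (100 - penalty_percent) / 100


if __name__ == "__main__":
    # Hypothetical assignment score of 80, submitted 0 to 4 days late.
    for days in range(5):
        print(days, late_adjusted_score(80, days))
```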

 


Tentative Course Outline:

 

Week  Subject                                                 Related chapters in 3rd edition of textbook

1     Introduction/Overview of NLP                            Ch. 1
2     Regular Expressions, Text Normalization, Edit Distance  Ch. 2
3     N-gram Language Models                                  Ch. 3
4     Spelling Correction, Part-of-Speech Tagging             Ch. 8 & Appendix B
5     Text Classification: Naive Bayes                        Ch. 4
6     Text Classification: Logistic Regression                Ch. 5
7     Vector Semantics                                        Ch. 6

      MIDTERM

8     Neural Networks and Neural Language Models              Ch. 7
9     RNNs and LSTMs                                          Ch. 9
10    Transformers and Large Language Models                  Ch. 10
11    Fine-Tuning and Masked Language Models                  Ch. 11
12    Morphological Processing                                Ch. 3 from 2nd edition of the book
13    Context-Free Grammars and Syntactic Parsing             Ch. 17 and other material
14    Statistical Parsing                                     Ch. 18 and other material

      FINAL EXAM

 

           

 


Lecture Notes:

·       lec01-introduction.pdf

·       lec02-1-BasicTextProcessing.pdf

·       lec02-2-MinimumEditDistance.pdf

·       lec03-LanguageModels.pdf

·       lec04-1-SpellingCorrection.pdf

·       lec04-2-PartOfSpeechTagging.pdf

·       lec05-TextClassificationNaiveBayes.pdf

·       lec06-LogisticRegression.pdf

·       lec07-VectorSemantics_Word2vec.pdf

·       lec08-NN_NeuralLanguageModels.pdf

·       lec09-RNNs_LSTMs.pdf

·       lec10-Transformers_LLMs.pdf

·       lec11-BidirectionalTransformerEncoders.pdf

·       lec12-MorphologicalProcessing.pdf

·       lec13-1-SyntacticParsing.pdf

·       lec13-2-StatisticalParsing.pdf

 

 


Announcements:

·     We will use the HADI system (https://hadi.hacettepe.edu.tr/login/) for all course announcements. All course materials, as well as your grades, will be available in the HADI system. You should check it regularly for course announcements.

 

·     You will submit your assignments for AIN442/BBM497 using the HADI system.

 

·     Midterm Date: April 9, Time: ??, Location: ??

o  The midterm will be a closed-book exam.