AIN440 Introduction to Natural Language Processing

BBM495 Introduction to Natural Language Processing

This course is given together with its complementary practicum (laboratory) course, “AIN442 Practicum in Natural Language Processing” / “BBM497 Introduction to Natural Language Processing Laboratory”.


Semester        :  2026 Spring
Instructor      :  Ilyas Cicekli  

Email              :  ilyas@cs.hacettepe.edu.tr
Class Hours   :  Wednesday 9:30-12:15       (AIN440)

                           Wednesday 13:30-16:15      (BBM495)

                           Classroom: Seminar Hall 


AIN442/BBM497:

            Class Hours               :  Wednesday 16:30-18:30

   Classroom: Seminar Hall 

Research Assistants  :  İsmail Furkan Atasoy & Sümeyye Meryem Taşyürek

Email                          :  ismailfurkanatasoy@cs.hacettepe.edu.tr

   meryemtasyurek@cs.hacettepe.edu.tr


Textbook

1.   Daniel Jurafsky and James H. Martin, "Speech and Language Processing", Third Edition, Prentice Hall, 2024.

 

Reference Books

1.   Christopher D. Manning and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", The MIT Press, 1999.

2.   Steven Bird, Edward Loper, and Ewan Klein, "Natural Language Processing with Python", O'Reilly Media Inc., 2009.


Grading for AIN440/BBM495

Midterm 1 : 30% 

Midterm 2 : 30% 

Final Exam : 40% 

 

Attendance Policy:

·     Regular attendance is expected. Attendance will be taken during class hours and may be recorded more than once on the same day.

 


 

Grading for AIN442/BBM497:

Programming Assignments : 100%          

 

Programming Assignment Policy:

·     There will be at least four programming assignments.

·     Late Policy:

o  You must submit your programming assignments before their due dates.

o  You may submit an assignment up to three days late, with a 10% penalty for each day late (1 day late: 10%, 2 days late: 20%, 3 days late: 30%).

 


Tentative Course Outline:

 

Week  Subject                                                   Related chapters in 3rd edition of textbook
1     Introduction/Overview of NLP                              Ch. 1
2     Regular Expressions, Text Normalization, Edit Distance    Ch. 2
3     N-gram Language Models                                    Ch. 3
4     Spelling Correction, Part-of-Speech Tagging               Ch. 8 & Appendix B
5     Text Classification: Naive Bayes                          Ch. 4 & Appendix B
6     Text Classification: Logistic Regression                  Ch. 4
7     Vector Semantics                                          Ch. 5
      MIDTERM 1
8     Neural Networks and Neural Language Models                Ch. 6
9     RNNs and LSTMs                                            Ch. 13
10    Transformers and Large Language Models                    Ch. 7 & 8
11    Fine-Tuning and Masked Language Models                    Ch. 10
      MIDTERM 2
12    Morphological Processing                                  Ch. 3 of the 2nd edition of the textbook
13    Context-Free Grammars and Syntactic Parsing               Ch. 17 and additional material
14    Statistical Parsing                                       Ch. 18 and additional material
      FINAL EXAM

 

           

 


Lecture Notes:

·       lec01-introduction.pdf

·       lec02-1-BasicTextProcessing.pdf

·       lec02-2-MinimumEditDistance.pdf

·       lec03-LanguageModels.pdf

·       lec04-1-SpellingCorrection.pdf

·       lec04-2-PartOfSpeechTagging.pdf

·       lec05-TextClassificationNaiveBayes.pdf

·       lec06-LogisticRegression.pdf

·       lec07-VectorSemantics_Word2vec.pdf

·       lec08-NN_NeuralLanguageModels.pdf

·       lec09-RNNs_LSTMs.pdf

·       lec10-Transformers_LLMs.pdf

·       lec11-BidirectionalTransformerEncoders.pdf

·       lec12-MorphologicalProcessing.pdf

·       lec13-1-SyntacticParsing.pdf

·       lec13-2-StatisticalParsing.pdf

 

 


Announcements:

·     We will use the HADI system (https://hadi.hacettepe.edu.tr/login/) for all course announcements. All course materials and your grades will also be available there, so check the HADI system regularly.

 

·     You will submit your assignments for AIN442/BBM497 using the HADI system.

 

 

·     Midterm 1 Date: April 1, Time: 16:30, Location: ??

 

·     Midterm 2 Date: May 6, Time: 16:30, Location: ??

 

o  Midterms will be closed-book exams.