Erkut Erdem

Professor

Computer Vision Laboratory (HUCVL)
Dept. of Computer Engineering
Hacettepe University

Office: 112
Address: Beytepe Campus, Ankara, Turkey TR-06800
e-mail: erkut at cs dot hacettepe dot edu dot tr
Phone: +90 312 297 7500 / 122
Fax: +90 (312) 297 7502

My research centers on computer vision and machine learning. I believe the right algorithms and representations are those that take contextual influences into account. Thus, the research objective that my students and I pursue is to incorporate different kinds of context (spatial, temporal, and/or cross-modal) into all levels of visual processing, from low-level to intermediate and high-level vision.

Current research interests: Visual Saliency Prediction, Automatic Image Description, Video/Photoset Summarization, Image Filtering, Image Editing

Hacettepe University Faculty Member 2010-now
Ecole Nationale Supérieure des Télécommunications Post-doctoral Researcher 2009-2010
Middle East Technical University Ph.D. (2008), M.Sc. (2003), B.Sc. (2001) 1997-2008
University of California Los Angeles Visiting Researcher Oct. 2007 - Dec. 2007
Virginia Bioinformatics Institute, Virginia Tech Visiting Researcher Jul. 2004 - Aug. 2004

News and highlights

So excited to announce that I am co-affiliated with Koç University and İş Bank Artificial Intelligence Center (KUIS AI) as a research fellow.

I'm looking for motivated MSc/PhD students. Scholarships are available; please send me your CV if interested!

[September 2024]: Our work on diffusion-based object removal from images accepted to NeurIPS 2024.

[September 2024]: Our work on a GAN-based unified framework for domain adaptation, image synthesis and manipulation accepted to SIGGRAPH Asia 2024.

[March 2024]: Our work on sequential compositional generalization in multimodal models accepted to NAACL 2024.

[February 2024]: Our work on video synthesis from events will be published in IEEE Transactions on Image Processing.

[January 2024]: Our work on evaluating zero-shot linguistic and temporal understanding capabilities of video-language models accepted to ICLR 2024.

[October 2023]: Our paper exploring the relationship between training dynamics and compositional generalization has been accepted for EMNLP Findings 2023.

[September 2023]: Our paper on hyperspectral image denoising has been accepted for publication in Signal Processing journal.

[August 2023]: Our work on omnidirectional video saliency prediction got accepted to BMVC 2023.

[July 2023]: Our work on text-guided image manipulation will be published in ACM Transactions on Graphics. We will be presenting our work at SIGGRAPH Asia 2023 in Sydney.

[June 2023]: I received a gift fund from Adobe Research. With Duygu Ceylan of Adobe Research and Aykut Erdem of KUIS AI LAB, we will develop novel methods for text-guided image synthesis and editing. Thanks, Adobe!

[May 2023]: We will be organizing the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge at INLG/SIGDial 2023 in September 2023.

[August 2023]: BIG-bench paper has been accepted for publication in Transactions on Machine Learning Research.

[February 2023]: Our work on omnidirectional image quality assessment got accepted to ICASSP 2023.

[December 2022]: I am honored to be recognized as one of the 2022 Outstanding Associate Editors of IEEE Transactions on Multimedia.

[November 2022]: We ranked 2nd in the euphemism detection shared task organized by the Figurative Language Processing workshop at EMNLP 2022.

[September 2022]: Our work on language-guided video manipulation accepted to BMVC 2022.

[June 2022]: Our work on language-guided image analysis got the best paper award at the 5th Multimodal Learning and Applications Workshop.

[May 2022]: We will be organizing a training school on Representation Mediated Multimodality at Schloss Etelsen, Germany, on September 26-30, 2022.

[April 2022]: Our work on language-guided image analysis got accepted to the 5th Multimodal Learning and Applications Workshop.

[February 2022]: Our survey paper on neural natural language generation has been accepted for publication in Journal of Artificial Intelligence Research.

[February 2022]: Our work on causal reasoning got accepted to Findings of ACL 2022.

[January 2022]: I will be teaching the undergraduate-level course: BBM444 Fundamentals of Computational Photography.

[January 2022]: Our work on query-specific video summarization has been accepted for publication in Multimedia Tools and Applications.

[December 2021]: Excited to share that our project on event-based vision under extremely low-light conditions will be funded by TUBITAK-1001 program. With Aykut Erdem, we will explore hybrid approaches to bring traditional and event cameras together to solve crucial challenges we face when processing dark videos.

[December 2021]: I received a gift fund from Adobe Research. With Duygu Ceylan of Adobe Research, and Aykut Erdem and Deniz Yuret of KUIS AI LAB, we will develop novel methods for semantic image editing. Thanks, Adobe!

[October 2021]: Our work on low-light image enhancement will be published in IEEE Transactions on Image Processing.

[August 2021]: I was appointed as an Associate Editor of IEEE Transactions on Multimedia (T-MM).

[July 2021]: Our work on stochastic video prediction got accepted to ICCV 2021.

[July 2021]: Our work on Turkish video captioning has been published in the Machine Translation journal.

[June 2021]: Our work on dynamic saliency prediction will be published in IEEE Transactions on Cognitive and Developmental Systems.

[April 2021]: Our joint work with HUCGLab on the use of synthetic data to analyze the performance of trackers under adverse weather conditions has been accepted for publication in Signal Processing: Image Communication.

[May 2021]: Our collaborative work with HUCGLab on joint person re-identification and attribute recognition has been accepted for publication in Image and Vision Computing.

[May 2021]: Our joint work with HUCGLab on procedural generation of person videos has been accepted for publication in Computer Graphics Forum.

[April 2021]: Our collaborative work with HUCGLab on procedural generation of person videos has been accepted for publication in Computer Graphics Forum.

[February 2021]: Our joint work with ICON lab at UMRAM, Bilkent University on multi-contrast MRI synthesis has been accepted for publication in Medical Image Analysis.

[February 2021]: I will be teaching the undergraduate-level course: BBM406 Fundamentals of Machine Learning.

[February 2021]: Our work on dense video captioning has been accepted for publication in Pattern Recognition Letters.

[January 2021]: Our work on learning visually-grounded cross-lingual representations got accepted to EACL 2021.

[October 2020]: Our work on visual story graphs has been accepted for publication in Signal Processing: Image Communication.

[May 2020]: Our ACM TOG paper on manipulating transient attributes of natural scenes was featured on Two Minute Papers.

[January 2020]: Our joint work with the Cognition, Learning and Robotics (CoLoRs) lab at Bogazici University on reasoning about action effects on articulated multi-part objects has been accepted to ICRA 2020.

[October 2019]: Our work on manipulating transient attributes of natural scenes via hallucination has been accepted for publication in ACM Transactions on Graphics.

[September 2019]: Our work about reasoning on procedural data is accepted to CoNLL 2019: "Procedural Reasoning Networks for Understanding Multimodal Procedures".

[April 2019]: I will give a tutorial on "Multimodal Learning with Vision and Language" together with Aykut Erdem at IPTA 2019.

[February 2019]: Our joint work with ICON lab at UMRAM, Bilkent University on multi-contrast MRI synthesis with GANs has been accepted for publication in IEEE Transactions on Medical Imaging.

[December 2018]: I will give a talk on Integrated Vision and Language at ITURO 2019.

[December 2018]: I have received the Young Researcher Award given by the Turkish Academy of Sciences.

[August 2018]: Our work on multimodal machine comprehension is accepted to EMNLP 2018: "RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes". Read our paper, download the data, and submit your predictions at our project website.

Curriculum Vitae (as of March 2024)

Projects

As Principal Investigator

Seeing Through Events: End-to-End Approaches to Event-Based Vision Under Extremely Low-Light Conditions

Project Duration: 3 years (2022-2025)
Sponsors: TUBITAK 1001 - Support Program for Scientific and Technological Research Projects (Award# 121E454)
project page

A Multimodal and Multilingual Framework for Video Captioning

Project Duration: 2 years (2018-2020)
Sponsors: TUBITAK and British Council - Newton-Katip Çelebi Fund Institutional Links Grant Programme (Award# 217E054)
project page

Using Synthetic Data for Deep Person Re-Identification

Project Duration: 2 years (2018-2020)
Sponsors: TUBITAK 1001 - Support Program for Scientific and Technological Research Projects (Award# 217E029)
project page

Understanding Images and Visualizing Text: Semantic Inference and Retrieval by Integrating Computer Vision and Natural Language Processing

Project Duration: 3 years (2014-2017)
Sponsors: TUBITAK 1001 - Support Program for Scientific and Technological Research Projects (Award# 113E116) and European Union under European Cooperation in Science and Technology (COST) Programme (ICT COST IC1037 Action)
project page

The Use of Multiple Cues and Contextual Knowledge in Computer Vision

Project Duration: 3 years (2012-2015)
Sponsors: TUBITAK 3501 - Career Development Program (Award# 112E146)
project page

As Co-Investigator

Seeing the Invisible: End-to-End Approaches for Hyperspectral Image Enhancement and Synthesis

Project Duration: 30 months (2023-2026)
Sponsors: TUBITAK 1001 - Support Program for Scientific and Technological Research Projects (Award# 123E385)
project page

Quality Assessment of 360-Degree Videos Guided by Audio-Visual Saliency

Project Duration: 3 years (2021-2024)
Sponsors: TUBITAK 1001 - Support Program for Scientific and Technological Research Projects (Award# 120E501)
project page

Summarization Approaches Towards Interpreting Big Visual Data

Project Duration: 3 years (2017-2020)
Sponsors: TUBITAK 1003 - Primary Subjects R&D Funding Program (Award# 116E685)
project page

City-Wide Video Surveillance System

Project Duration: 3 years (2016-2019)
Sponsors: TUBITAK 1007 - Public Institutions Research Funding Program (Award# 114G028)
project page

Selected Publications

ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models,
I. Kesen, A. Pedrotti, M. Dogan, M. Cafagna, E. C. Acikgoz, L. Parcalabescu, I. Calixto, A. Frank, A. Gatt, A. Erdem, E. Erdem, The International Conference on Learning Representations (ICLR 2024), Vienna, Austria, May 2024
:: pdf
:: project page
VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs,
M. H. Ali, A. Bond, T. Birdal, D. Ceylan, L. Karacan, E. Erdem, A. Erdem, IEEE International Conference on Computer Vision (ICCV 2023), Paris, France, October 2023
:: pdf
:: project page
CLIP-Guided StyleGAN Inversion for Text-Driven Real Image Editing,
A. C. Baykal, A. B. Anees, D. Ceylan, E. Erdem, A. Erdem, D. Yuret, ACM Transactions on Graphics, in press
:: pdf
:: project page
CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions,
T. Ates, M. S. Atesoglu, C. Yigit, I. Kesen, M. Kobas, E. Erdem, A. Erdem, T. Goksun, D. Yuret, Findings of ACL 2022
:: pdf
:: project page
Burst Photography for Learning to Enhance Extremely Dark Images,
A. S. Karadeniz, E. Erdem, A. Erdem, IEEE Transactions on Image Processing, Vol. 30, pp. 9372-9385, 2021
:: pdf
:: project page
A Gated Fusion Network for Dynamic Saliency Prediction,
A. Kocak, E. Erdem, A. Erdem, IEEE Transactions on Cognitive and Developmental Systems, accepted for publication
:: pdf
:: project page
mustGAN: multi-stream Generative Adversarial Networks for MR Image Synthesis,
M. Yurt, S. U. H. Dar, A. Erdem, E. Erdem, K. K. Oguz, T. Cukur, Medical Image Analysis, Vol. 70, May 2021
:: pdf
Cross-lingual Visual Pre-training for Multimodal Machine Translation,
O. Caglayan, M. Kuyu, M. S. Amac, P. Madhyastha, E. Erdem, A. Erdem, L. Specia, The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)
:: pdf
:: project page (with data and code)
Belief Regulated Dual Propagation Nets for Learning Action Effects on Articulated Multi-Part Objects,
A. E. Tekden, A. Erdem, E. Erdem, M. Imre, M. Y. Seker, and E. Ugur, International Conference on Robotics and Automation (ICRA) 2020, Paris, France, May-June 2020
:: pdf
:: project page
:: video
Manipulating Attributes of Natural Scenes via Hallucination,
L. Karacan, Z. Akata, A. Erdem and E. Erdem, ACM Transactions on Graphics, accepted for publication, 2019
:: pdf
:: project page (with code)
Procedural Reasoning Networks for Understanding Multimodal Procedures,
M.S. Amac, S. Yagcioglu, A. Erdem, and E. Erdem, The SIGNLL Conference on Computational Natural Language Learning (CoNLL), Hong Kong, November 2019
:: pdf
:: project page (with code)
Image Synthesis in Multi-Contrast MRI with Conditional Generative Adversarial Networks,
S. U. H. Dar, M. Yurt, L. Karacan, A. Erdem, E. Erdem, T. Cukur, IEEE Transactions on Medical Imaging, Vol. 38, No.10, pp. 2375-2388, October 2019
:: pdf
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes,
S. Yagcioglu, A. Erdem, E. Erdem, and N. Ikizler-Cinbis, Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018, Brussels, Belgium, October-November 2018
:: pdf
:: project page (with data and leaderboard)
Spatio-Temporal Saliency Networks for Dynamic Saliency Prediction,
C. Bak, A. Kocak, E. Erdem and A. Erdem, IEEE Transactions on Multimedia, 20(7), pp. 1688-1698, July 2018
:: pdf
Image Synthesis in Multi-Contrast MRI with Conditional Generative Adversarial Networks,
S. U. H. Dar, M. Yurt, L. Karacan, A. Erdem, E. Erdem, T. Cukur, arXiv preprint arXiv:1802.01221, February 2018
:: pdf
Alpha Matting with KL-Divergence Based Sparse Sampling,
L. Karacan, A. Erdem and E. Erdem, IEEE Transactions on Image Processing, 26(9), pp. 4523-4536, September 2017
:: pdf
Re-evaluating Automatic Metrics for Image Captioning,
M. Kilickaya, A. Erdem, N. Ikizler-Cinbis and E. Erdem, The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, April 2017
:: pdf
An Objective Deghosting Quality Metric for HDR Images,
O. T. Tursun, A. O. Akyuz, A. Erdem and E. Erdem, Computer Graphics Forum (Eurographics 2016), 35(2), pp. 139-152, May 2016
:: project page (with code)
:: pdf
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures,
R. Bernardi, R. Cakici, D. Elliott, A. Erdem, E. Erdem, N. Ikizler-Cinbis, F. Keller, A. Muscat, B. Plank, Journal of Artificial Intelligence Research, 55, pp. 409-442, February 2016
:: pdf
Image Matting with KL-Divergence Based Sparse Sampling,
L. Karacan, A. Erdem and E. Erdem, IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, December 2015
:: project page
:: pdf
The State of the Art in HDR Deghosting: A Survey and Evaluation,
O. T. Tursun, A. O. Akyuz, A. Erdem and E. Erdem, Computer Graphics Forum (Eurographics State-of-the-art Report (STAR) 2015), 34(2), pp. 683-707, May 2015
:: pdf
Structure Preserving Image Smoothing via Region Covariances,
L. Karacan, E. Erdem and A. Erdem, ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2013), Vol. 32, No. 6, November 2013
:: project page (with code)
:: pdf
Visual saliency estimation by nonlinearly integrating features using region covariances,
E. Erdem and A. Erdem, Journal of Vision, Vol. 13, No. 4, pp. 1-20, March 2013
:: project page (with code)
:: pdf

For a full list of publications, please see the Publications page.
