DataScienceUIBK/TemporalQA-Survey

It's High Time: A Survey of Temporal Question Answering

Bhawna Piryani · Abdelrahman Abdallah · Jamshid Mozafari · Avishek Anand · Adam Jatowt
University of Innsbruck · TU Delft

📄 Read the Paper on arXiv | 🗓️ 2025


📘 Overview

This repository provides a comprehensive, curated collection of research papers, datasets, methods, and resources focused on Temporal Question Answering (TQA) and Temporal Information Retrieval (Temporal IR). It accompanies our survey paper on how AI models reason about time, adapt to evolving knowledge, answer temporally constrained questions, and retrieve time-sensitive information.

[Figure: Temporal QA Taxonomy]


Key Contributions

✨ Comprehensive Survey: Coverage of 27+ datasets, 50+ methods spanning 2003-2025
📊 Unified Taxonomy: Systematic categorization of tasks, datasets, and approaches
🔍 Critical Analysis: Evaluation of current capabilities and fundamental limitations
🚀 Research Roadmap: 7 critical directions for advancing temporal reasoning in AI

Why Temporal QA Matters

Time shapes how we:

  • 🗞️ Retrieve information: "Latest climate policies" vs. "policies from the 1990s"
  • 🧠 Reason about events: Understanding causality, change, and evolution
  • 💬 Interact with AI: Expecting contextually appropriate temporal grounding
  • 🔄 Adapt to change: Handling evolving facts and knowledge updates

📊 Datasets

Quick Statistics

  • 27+ TQA Datasets covering diverse domains and temporal scopes
  • 2.5M+ Questions spanning historical archives (1367) to real-time web (2025)
  • Dataset Categories: Diachronic, Synchronic, Web-based, Synthetic, KG-based

Featured Datasets

🗞️ Diachronic Datasets (Time-Stamped Historical Documents)

| Dataset | Year | #Questions | Source | Time Coverage | Answer Type | Links |
|---|---|---|---|---|---|---|
| ArchivalQA | 2022 | 532K | NYT Corpus | 1987-2007 | Extractive | Paper · GitHub |
| ChroniclingAmericaQA | 2024 | 485K | Historical Newspapers | 1800-1920 | Extractive | Paper · GitHub |
| StreamingQA | 2022 | 147K | News Articles | 2007-2020 | Extractive | Paper · GitHub |
| NewsQA | 2017 | 119K | CNN/Daily Mail | 2007-2015 | Freeform | Paper · GitHub |
| TempLAMA | 2022 | 50K | News | 2010-2020 | Extractive | Paper · GitHub |
| TORQUE | 2020 | 21K | News | - | Abstractive | Paper · GitHub |
| ForecastQA | 2021 | 10.3K | News | 2015-2019 | Multiple Choice | Paper · Website |
| TDDiscourse | 2019 | 6.1K | News | Unspecified | Extractive | Paper · GitHub |

📖 Synchronic Datasets (Wikipedia Snapshots)

| Dataset | Year | #Questions | Time Scope | Answer Type | Multi-Hop | Links |
|---|---|---|---|---|---|---|
| ComplexTempQA | 2024 | 100.2K | 1987-2023 | Extractive | ✓ | Paper · GitHub |
| TEMPREASON | 2023 | 52.8K | 634-2023 | Abstractive | ✗ | Paper · GitHub |
| TimeQA | 2021 | 41.2K | 1367-2018 | Extractive | ✗ | Paper · GitHub |
| TemporalAlignmentQA | 2024 | 20K | 2000-2023 | Abstractive | ✗ | Paper · GitHub |
| SituatedQA | 2021 | 12.2K | ≤ 2021 | Mixed | ✗ | Paper · GitHub |
| TempTabQA | 2023 | 11.4K | Infoboxes | Abstractive | ✗ | Paper · Website |
| TiQ | 2024 | 10K | Unspecified | Entities | ✗ | Paper · GitHub |
| PAT-Questions | 2024 | 6.1K | Present-anchored | Extractive | ✓ | Paper · GitHub |
| TRACIE | 2021 | 5.4K | ≤ 2020 | Abstractive | ✗ | Paper · GitHub |
| MenatQA | 2023 | 2.8K | 1367-2018 | Extractive | ✗ | Paper · GitHub |

🌐 Web & Real-Time Datasets

| Dataset | Year | #Questions | Source | Update Frequency | Links |
|---|---|---|---|---|---|
| ReaLTimeQA | 2023 | 5.1K | Web Search | Weekly (2020-2024) | Paper · Website |
| FreshQA | 2024 | 600 | Google Search | Periodic | Paper · GitHub |

🧪 Synthetic & Reasoning-Focused Datasets

| Dataset | Year | #Questions | Focus | Links |
|---|---|---|---|---|
| COTEMPQA | 2024 | 4.7K | Co-temporal reasoning | Paper · GitHub |
| UnSeenTimeQA | 2024 | 3.6K | Beyond memorization | Paper · GitHub |
| Test of Time (ToT) | 2024 | 1.8K | Temporal reasoning eval | Paper · GitHub |
| TIMEDIAL | 2021 | 1.1K | Temporal commonsense | Paper · GitHub |

🔧 Methods & Approaches

Evolution Timeline

📅 2003-2010: Rule-Based Era
   └─ TimeML, TERSEO, temporal taggers

📅 2011-2019: Statistical & Early Neural
   └─ Language models, temporal embeddings

📅 2020-2022: Transformer Revolution
   └─ Temporal pretraining, time-aware architectures

📅 2023-2025: LLM & RAG Era
   └─ Retrieval-augmented generation, temporal reasoning

Method Categories

🤖 Temporal Language Models

| Model | Year | Key Innovation | Architecture | Paper | Code |
|---|---|---|---|---|---|
| TempoT5 | 2022 | Temporal conditioning via prefixes | T5 + timestamp prefixes | Paper | GitHub |
| BiTimeBERT | 2023 | Dual temporal encoding (timestamp + content) | BERT + bi-temporal module | Paper | GitHub |
| TempoBERT | 2022 | Time-aware masking strategy | BERT + temporal masking | Paper | GitHub |
| TALM | 2023 | Hierarchical temporal word representations | BERT + temporal adapter | Paper | GitHub |
| SG-TLM | 2023 | Syntax-guided + temporal-aware masking | BERT + dual masking | Paper | GitHub |
| TSM | 2023 | Temporal span masking | T5 + salient span masking | Paper | Contact authors |
| Temporal Attention | 2022 | Time matrix in attention mechanism | Transformer + time matrix | Paper | GitHub |
| TCQA | 2023 | Synthetic QA + span selection | T5-based | Paper | GitHub |
| Time-aware Prompting | 2022 | Temporal prompts for generation | GPT-2 + temporal prompts | Paper | GitHub |

🔍 Temporal RAG Systems

| System | Year | Pipeline Architecture | Temporal Signals | Paper | Code |
|---|---|---|---|---|---|
| TempRetriever | 2025 | Fusion-based dense retrieval | Query + doc timestamps | Paper | Contact authors |
| TimeR4 | 2024 | Retrieve-Rewrite-Retrieve-Rerank | TKG timestamps + constraints | Paper | GitHub |
| MRAG | 2024 | Modular multi-hop framework | Symbolic + semantic temporal scoring | Paper | Contact authors |
| TempRALM | 2024 | Dense retrieval + temporal proximity | Timestamp-based ranking | Paper | Contact authors |
| TsContriever | 2024 | Contrastive time-sensitive retrieval | Time-aware embeddings | Paper | GitHub |
| FreshLLMs | 2024 | Search augmentation for recency | Web search integration | Paper | GitHub |

🧠 Temporal Reasoning Methods

| Method | Year | Reasoning Type | Key Contribution | Paper | Code |
|---|---|---|---|---|---|
| ECONET | 2021 | Continual adaptation | Event consistency across updates | Paper | GitHub |
| ConTempo | 2024 | Contrastive temporal relations | Unified temporal relation extraction | Paper | GitHub |
| TIMERS | 2021 | Document-level relations | Structured inference layers | Paper | GitHub |
| TRAM | 2024 | Multi-dimensional reasoning | Event frequency, duration, ordering | Paper | GitHub |
| TODAY | 2023 | Differential analysis | Temporal robustness testing | Paper | GitHub |
| Narrative-of-Thought | 2024 | Narrative-based reasoning | Recounted narratives for coherence | Paper | GitHub |

📜 Classical Methods (Rule-Based & Statistical)

| Era | Methods | Key Papers |
|---|---|---|
| Rule-Based | TimeML, TERSEO, temporal taggers | Harabagiu & Bejan, 2005; Saquete et al., 2004; Saquete et al., 2004 |
| Statistical IR | Time-based language models, temporal ranking | Li & Croft, 2003; Berberich et al., 2010; Arikan et al., 2009; Alonso et al., 2007 |

📚 Complete historical overview →
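
Several of the retrieval-oriented entries above combine a semantic relevance score with a timestamp signal. The snippet below is a minimal sketch of that general idea, not the scoring function of any listed system: the function names, the exponential decay, and the `decay_per_year` and `weight` parameters are all assumptions chosen for illustration.

```python
import math
from datetime import date


def temporal_score(semantic_score, doc_date, query_date,
                   decay_per_year=0.5, weight=0.3):
    """Blend a semantic score with a temporal-proximity score.

    Both the exponential decay and the linear blend are illustrative
    assumptions, not taken from any system in the tables above.
    """
    gap_years = abs((query_date - doc_date).days) / 365.25
    proximity = math.exp(-decay_per_year * gap_years)  # 1.0 when the dates match
    return (1 - weight) * semantic_score + weight * proximity


def rerank(candidates, query_date):
    """Sort (doc_id, semantic_score, doc_date) tuples by the blended score."""
    return sorted(
        candidates,
        key=lambda c: temporal_score(c[1], c[2], query_date),
        reverse=True,
    )


# Hypothetical candidates: the older document has the higher semantic score.
docs = [
    ("old_report", 0.82, date(2012, 5, 1)),
    ("recent_report", 0.78, date(2023, 11, 20)),
]
ranked = rerank(docs, query_date=date(2024, 1, 1))
print([doc_id for doc_id, _, _ in ranked])
```

With these hypothetical inputs the more recent document moves ahead of the older one even though its semantic score is slightly lower. In practice the decay rate and blend weight would need tuning per corpus.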


📖 Temporal Tasks

Core temporal prediction tasks supporting TQA systems:

| Task | Input | Output | Key Applications | Representative Papers |
|---|---|---|---|---|
| Event Dating | Event description | Event timestamp | Historical analysis, timeline construction | Das et al., 2017; Wang et al., 2021 |
| Document Dating | Document text | Creation date | Digital preservation, metadata recovery | Kumar et al., 2012; Niculae et al., 2014; Vashishth et al., 2018; Jatowt et al., 2007; SalahEldeen and Nelson, 2013 |
| Focus Time Estimation | Document content | Discussed time period | Historical QA, event-centric retrieval | Jatowt et al., 2013; Jatowt et al., 2013; Shrivastava et al., 2017 |
| Query Time Profiling | Search query | Temporal intent/distribution | Time-aware search, query understanding | Kanhabua & Nørvåg, 2010; Jones and Diaz, 2007; Dakka et al., 2008; Gupta and Berberich, 2014 |
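
To make the input/output shape of query time profiling concrete, here is a minimal sketch that builds a distribution over years from explicit temporal expressions only. It is not the method of any paper cited above; the function name and both regular expressions are assumptions, and queries with implicit temporal intent would need other evidence (for example, timestamps of retrieved documents).

```python
import re
from collections import Counter

# Assumed patterns: four-digit years (1000-2099) and decades such as "1990s".
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
DECADE_PATTERN = re.compile(r"\b(1[0-9]{2}0|20[0-9]0)s\b")


def explicit_time_profile(query):
    """Return {year: probability} from explicit years/decades in the query.

    An empty dict means no explicit temporal expression was found.
    """
    years = Counter()
    for match in DECADE_PATTERN.finditer(query):
        start = int(match.group(1))
        for year in range(start, start + 10):  # spread mass over the decade
            years[year] += 1
    for match in YEAR_PATTERN.finditer(query):
        years[int(match.group(1))] += 1
    total = sum(years.values())
    return {year: count / total for year, count in sorted(years.items())}


print(explicit_time_profile("policies from the 1990s"))
print(explicit_time_profile("Latest climate policies"))
```

The first query yields a uniform profile over 1990-1999. The second yields an empty profile, which is exactly the case where implicit intent ("latest") has to be inferred by other means.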

🏥 Domain-Specific Applications

Medical Domain

Challenges: Patient timeline reconstruction, symptom progression, treatment sequencing

| System/Dataset | Focus | Key Paper |
|---|---|---|
| TimeText | Time-oriented clinical QA | Zhou et al., 2008 |
| Temporal Clinical QA | Semantic web techniques | Tao et al., 2010 |
| Time-aware Health QA | Evidence retrieval with recency | Vladika & Matthes, 2024 |

Legal Domain

Challenges: Evolving statutes, precedent timelines, jurisdiction-specific temporal expressions

| System/Dataset | Focus | Key Paper |
|---|---|---|
| ChronosLex | Time-aware incremental training | T.y.s.s et al., 2024 |

Financial Domain

Challenges: Regulatory changes, market events, time-sensitive numerical reasoning

| Dataset | Focus | Key Paper |
|---|---|---|
| FinQA | Numerical reasoning over financial data | Chen et al., 2021 |
| FinTextQA | Long-form financial QA | Chen et al., 2024 |
| FinDER | Financial QA with RAG | Choi et al., 2025 |

🛠️ Resources & Tools

Temporal Taggers & NLP Tools

| Tool | Year | Languages | Type | Features | Link |
|---|---|---|---|---|---|
| HeidelTime | 2010 | 200+ | Rule-based | High precision, domain adaptation | Paper · GitHub |
| SUTime | 2012 | English | Rule-based | Stanford CoreNLP integration | Paper · Website |
| CogCompTime | 2018 | English | Neural | Compositional temporal understanding | Paper · GitHub |
| Temponym Tagger | 2016 | English | Hybrid | Implicit temporal references | Paper |

Document Collections

| Collection | Period | Size | Domain | Access |
|---|---|---|---|---|
| NYT Annotated Corpus | 1987-2007 | 1.8M articles | News | LDC License |
| Chronicling America | 1800-1920 | | Historical Newspapers | Free Access |
| Newswire Corpus | 1878-1977 | 2.7M articles | News | HuggingFace |
| Wikipedia Dumps | Various | TB-scale | Encyclopedia | Wikimedia |

Evaluation Frameworks


🚀 Future Directions

Our survey identifies 7 critical research areas requiring immediate attention:

1️⃣ Dynamic Temporal Knowledge Management

Problem: Static corpora can't handle evolving facts
Challenge: Temporal propagation when updating related events
Needed: Real-time knowledge graphs with dependency tracking

2️⃣ Temporally-Aware LLM Agents

Problem: LLMs hallucinate temporal information
Challenge: Resolving "last Tuesday" or "since our last chat"
Needed: Timeline memory, temporal reference resolution
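
As a small illustration of temporal reference resolution, the sketch below resolves expressions of the form "last <weekday>" against a known reference date. The function name is hypothetical, and the interpretation (the most recent such weekday strictly before the reference date) is an assumption, since the phrase is ambiguous in everyday English. A real agent would also need a memory of when each message was sent to resolve phrases like "since our last chat".

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_last_weekday(expression, reference):
    """Resolve 'last <weekday>' relative to `reference`.

    Assumption: the answer is the most recent matching weekday strictly
    before the reference date, so a result is 1 to 7 days in the past.
    """
    words = expression.lower().split()
    if len(words) != 2 or words[0] != "last" or words[1] not in WEEKDAYS:
        raise ValueError(f"unsupported expression: {expression!r}")
    target = WEEKDAYS.index(words[1])  # matches date.weekday(): Monday == 0
    days_back = (reference.weekday() - target) % 7 or 7
    return reference - timedelta(days=days_back)


# 2025-05-26 is a Monday, so "last Tuesday" resolves to 2025-05-20.
print(resolve_last_weekday("last Tuesday", date(2025, 5, 26)))
```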

3️⃣ Diachronic-Synchronic Integration

Problem: Most systems use only one knowledge type
Challenge: Aligning historical trends with current snapshots
Needed: Cross-source temporal alignment algorithms

4️⃣ Temporal Uncertainty & Confidence

Problem: Systems treat all dates as exact
Challenge: "Around 476 AD", "mid-20th century"
Needed: Probabilistic temporal representations
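
One simple way to picture a probabilistic temporal representation is to store an approximate date as a distribution over years instead of a single value. The sketch below uses a uniform distribution over an inclusive year range. The class name, the default tolerance of five years for "around", and the reading of "mid-century" as years 40 to 60 of the century are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyYear:
    """An approximately known year, uniform over the inclusive range [start, end]."""
    start: int
    end: int

    def probability_in(self, lo, hi):
        """Probability that the true year lies in the inclusive range [lo, hi]."""
        overlap = min(self.end, hi) - max(self.start, lo) + 1
        return max(0, overlap) / (self.end - self.start + 1)


def around(year, tolerance=5):
    """'Around 476 AD' -> 471..481 under the assumed tolerance."""
    return FuzzyYear(year - tolerance, year + tolerance)


def mid_century(century):
    """'mid-20th century' -> 1940..1960 under the assumed definition."""
    base = (century - 1) * 100
    return FuzzyYear(base + 40, base + 60)


print(around(476).probability_in(470, 476))         # 6 of 11 candidate years
print(mid_century(20).probability_in(1950, 1999))   # 11 of 21 candidate years
```

A QA system built on such a representation could answer "did this happen before 477?" with a probability instead of a forced yes or no. Other distributions (for example, a Gaussian centered on the stated year) would fit the same interface.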

5️⃣ Multilingual & Multimodal TQA

Problem: Most work is English text-only
Challenge: Lunar calendars, visual time cues, cultural references
Needed: Cross-lingual temporal taggers, vision-language models

6️⃣ Implicit Temporal Intent Understanding

Problem: Many questions hide their time constraints
Challenge: Inferring "now" vs. "historically" from context
Needed: Context-dependent temporal intent detection

7️⃣ Evaluation & Benchmarking

Problem: Standard metrics don't capture temporal coherence
Challenge: Measuring temporal grounding, not just accuracy
Needed: Temporal-aware evaluation protocols
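
To show what separating temporal grounding from plain accuracy could look like, here is a hedged sketch of a time-scoped exact-match check. It is not an established metric from the survey: the function names, the three labels, and the timeline format are assumptions, and the example timeline is entirely hypothetical.

```python
from datetime import date


def answer_valid_at(timeline, when):
    """Return the gold answer whose validity interval [start, end) contains `when`.

    `timeline` is a list of (start, end, answer); end=None means still valid.
    """
    for start, end, answer in timeline:
        if start <= when and (end is None or when < end):
            return answer
    return None


def temporal_exact_match(prediction, timeline, question_date):
    """Label a prediction as 'correct', 'temporally_misaligned', or 'wrong'."""
    normalized = prediction.strip().lower()
    gold = answer_valid_at(timeline, question_date)
    if gold is not None and normalized == gold.lower():
        return "correct"
    if any(normalized == answer.lower() for _, _, answer in timeline):
        return "temporally_misaligned"  # true at some time, but not the asked one
    return "wrong"


# Hypothetical timeline for "Who leads organization X?"
leaders = [
    (date(2010, 1, 1), date(2018, 1, 1), "Person A"),
    (date(2018, 1, 1), None, "Person B"),
]
print(temporal_exact_match("Person A", leaders, date(2020, 6, 1)))
```

Here the prediction names someone who did hold the role, only at the wrong time, so it is labeled `temporally_misaligned` instead of being counted as an ordinary miss.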


✨ Citation

If you find this work useful, please cite 📜 our paper:

Plain

Piryani, B., Abdallah, A., Mozafari, J., Anand, A., & Jatowt, A. (2025). It's High Time: A Survey of Temporal Question Answering. arXiv preprint arXiv:2505.20243.

BibTeX

@article{piryani2025s,
  title={It's High Time: A Survey of Temporal Question Answering},
  author={Piryani, Bhawna and Abdallah, Abdelrahman and Mozafari, Jamshid and Anand, Avishek and Jatowt, Adam},
  journal={arXiv preprint arXiv:2505.20243},
  year={2025}
}

🪪 License

This project is licensed under the MIT License - see the LICENSE file for details.

📝 Contributing

We welcome contributions to keep this survey comprehensive and up-to-date!

Missing a Paper or Dataset?

If we've missed your work or you know of a relevant paper/dataset that should be included, please send us an email at:

📧 bhawna.piryani@uibk.ac.at

Please include:

  • Paper title and authors
  • Link to paper and code/data (if available)
  • Brief description of the contribution

You can also open an issue on GitHub.
