Datasets Articles | AI-SCHOLAR.TECH | AI-SCHOLAR | AI: (Artificial Intelligence) Articles and technical information media

Potential Of The Conversation Optimization Tokenizer: A Method To Improve LLM Inference Efficiency By 10%

30/07/2025

Democratizing GPT-4o Level Image Generation: The Janus-4o And ShareGPT-4o-Image Challenge

24/07/2025

Ultra-Sparse Memory Network: A New Method To Change Transformer Memory Efficiency

23/06/2025

SportQA, A New Dataset That Measures The Comprehension Of Sports In Large Language Models

30/01/2025 Large Language Models

Proposal For A New Evaluation Method For AI Assistants Based On Human Preferences

29/01/2025 Large Language Models

New UrbanSARFloods Dataset Solves Flood Detection Challenges

15/01/2025 Datasets

Persona Hub, A Large Dataset Built From 1 Billion Personas, Is Now Available!

19/12/2024 Persona-driven Data Synthesis

[InfiMM-WebMath-40B] Improves The Mathematical Performance Of LLM With A Dataset Consisting Of 2.4 Billion Mathematical Documents!

30/10/2024 Datasets

IndiBias, A New Dataset For Measuring India-specific Social Biases

16/08/2024 Large Language Models

[EDAT24] Event-based Dataset Specialized For Manufacturing Operation Classification

05/08/2024 Datasets

[JMMLU] Prompt Politeness Affects LLM Performance!

26/07/2024 ChatGPT

Analog And Multimodal Manufacturing Data Sets Acquired On The Future Factory Platform

30/05/2024 Datasets

OpenToM, A Benchmark For Evaluating Whether An LLM Has A "Theory Of Mind," Is Now Available!

24/05/2024 Datasets

"BioPlanner" And "BIOPROT Dataset" Automate Experimental Protocols For Biological Research

24/05/2024 Large Language Models

Investigation Of A Method To Continuously Authenticate Users With Mouse Movements

20/05/2024 Machine Learning

Datasets

Potential Of The Conversation Optimization Tokenizer: A Method To Improve LLM Inference Efficiency By 10%

Democratizing GPT-4o Level Image Generation: The Janus-4o And ShareGPT-4o-Image Challenge

Ultra-Sparse Memory Network: A New Method To Change Transformer Memory Efficiency

Hymba, A New Architecture That Pushes The Limits Of Small LLMs

LLM To Create Training Data For Domain Generalization

Construction And Analysis Of The "TruthEval" Dataset To Expose LLM Weaknesses

SportQA, A New Dataset That Measures The Comprehension Of Sports In Large Language Models

Proposal For A New Evaluation Method For AI Assistants Based On Human Preferences

New UrbanSARFloods Dataset Solves Flood Detection Challenges

Persona Hub, A Large Dataset Built From 1 Billion Personas, Is Now Available!

[InfiMM-WebMath-40B] Improves The Mathematical Performance Of LLM With A Dataset Consisting Of 2.4 Billion Mathematical Documents!

IndiBias, A New Dataset For Measuring India-specific Social Biases

[EDAT24] Event-based Dataset Specialized For Manufacturing Operation Classification

[JMMLU] Prompt Politeness Affects LLM Performance!

Analog And Multimodal Manufacturing Data Sets Acquired On The Future Factory Platform

OpenToM, A Benchmark For Evaluating Whether An LLM Has A "Theory Of Mind," Is Now Available!

"BioPlanner" And "BIOPROT Dataset" Automate Experimental Protocols For Biological Research

Investigation Of A Method To Continuously Authenticate Users With Mouse Movements
