NeurIPS · 2017

Attention Is All You Need

Introduces the Transformer, an architecture based entirely on attention. Eliminates recurrence and convolution for sequence transduction. Enables significantly higher parallelization during training…
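The attention operation the paper is built on can be sketched as scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (array shapes are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the Transformer's core operation."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V                             # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 queries, d_k = 8
K = rng.standard_normal((6, 8))   # 6 keys
V = rng.standard_normal((6, 8))   # 6 values
out = scaled_dot_product_attention(Q, K, V)        # shape (4, 8)
```

Because every query attends to every key in one matrix product, the whole sequence is processed in parallel, which is the source of the training speedup over recurrent models.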

AI Summary
2023

Replace standard sum or mean aggregators with softmax aggregation in GNNs to maintain high-resolution distinctions between similar latent values. This is essential for algorithmic tasks where precision in value compariso…
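One common form of softmax aggregation can be sketched as below. This is an illustrative NumPy sketch, not necessarily the cited paper's exact formulation; the temperature parameter `t` and the elementwise weighting are assumptions:

```python
import numpy as np

def softmax_aggregate(neighbor_feats, t=1.0):
    # neighbor_feats: (n_neighbors, d). Each feature dimension is pooled with
    # softmax weights computed over the neighbor axis. t = 0 recovers the mean
    # aggregator; large t approaches elementwise max, so nearby latent values
    # stay distinguishable instead of being averaged together.
    logits = t * neighbor_feats
    w = np.exp(logits - logits.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    return (w * neighbor_feats).sum(axis=0)

feats = np.array([[1.0], [2.0], [3.0]])      # three neighbors, 1-D features
mean_like = softmax_aggregate(feats, t=0.0)  # -> 2.0 (behaves like mean)
max_like = softmax_aggregate(feats, t=50.0)  # -> ~3.0 (behaves like max)
```

Interpolating between mean and max with `t` is what preserves high-resolution distinctions between similar neighbor values.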

arXiv.org · 2024

Avoid treating US-centric fairness definitions as universal standards. When deploying global AI systems, researchers must adapt fairness metrics to local cultural and social contexts to ensure interventions are actually …

arXiv.org · 2024

Replace standard fully-connected layers with EUGens to achieve up to 27% faster inference and 30% memory savings. This is ideal for deploying large Transformers or MLPs on edge devices or real-time systems.…

arXiv.org · 2024

Enables the use of neural networks in safety-critical computational tasks by providing formal proofs of correctness. Use this framework when 'mostly correct' is insufficient and absolute mathematical certainty is require…

arXiv.org · 2025

Deploy evolutionary LLM agents for discrete optimization tasks where traditional heuristics plateau. This method broke a 56-year-old record in matrix multiplication, proving its utility for high-stakes mathematical and a…

arXiv.org · 2025

Formal separation of unintended memorization from generalization. Novel method to estimate total model capacity. Scaling laws for capacity and membership inference…

arXiv.org · 2025

Generative evaluation system using frontier video model Veo. Framework for OOD generalization and safety red teaming. Action-conditioned, multi-view consistent simulation for robotics…

arXiv.org · 2025

Standardized version of Meta-World benchmark. Disambiguation of inconsistent results in literature. Insights into multi-task RL benchmark design…

arXiv.org · 2025

Adapts Diffusion Transformers for 3D molecular conformer generation. Modular architecture separating 3D coordinates from graph connectivity. Two graph-based conditioning strategies for varying molecular structures…

NeurIPS · 2023

Prioritize tuning the regularization term in SSL objectives like VICReg or Barlow Twins to improve semantic clustering. This specific component is the primary driver for translating pretext tasks into useful downstream c…
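The regularization terms in question can be sketched for VICReg, whose objective includes a variance hinge and a covariance penalty. A NumPy sketch of those two terms (the hinge target `gamma = 1` follows the published objective, but this is illustrative, not a reference implementation):

```python
import numpy as np

def vicreg_regularizer(z, gamma=1.0, eps=1e-4):
    # z: (batch, d) embeddings from one branch.
    z = z - z.mean(axis=0)
    std = np.sqrt(z.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, gamma - std))  # hinge: push per-dim std up to gamma
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d              # decorrelate embedding dimensions
    return var_loss, cov_loss

rng = np.random.default_rng(0)
v_loss, c_loss = vicreg_regularizer(rng.standard_normal((128, 16)))
```

The coefficients weighting these terms against the invariance loss are the knobs the summary suggests prioritizing.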

TMLR · 2023

Use blockwise training to bypass memory bottlenecks in large-scale models. By training layers independently, you can fit significantly larger architectures on hardware with limited VRAM without storing full computational…
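The idea can be sketched as greedy blockwise training: each block is optimized against its own local loss, and only its detached activations are passed on, so no cross-block computational graph is ever stored. The two-block setup, the local linear-probe loss, and all hyperparameters below are illustrative assumptions, not the cited paper's recipe:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 32))   # input batch
y = rng.standard_normal((256, 4))    # targets for the local probes

def train_block(X_in, y, d_out, lr=0.1, steps=200):
    # One block = tanh layer + local linear probe. Only this block's
    # parameters and activations are live while it trains.
    W = rng.standard_normal((X_in.shape[1], d_out)) * 0.1
    P = rng.standard_normal((d_out, y.shape[1])) * 0.1
    n = len(X_in)
    for _ in range(steps):
        h = np.tanh(X_in @ W)
        err = h @ P - y                   # local loss: ||h P - y||^2
        grad_P = h.T @ err / n
        grad_h = (err @ P.T) * (1 - h**2) # backprop stops at this block's input
        grad_W = X_in.T @ grad_h / n
        P -= lr * grad_P
        W -= lr * grad_W
    return W, np.tanh(X_in @ W)           # frozen weights + detached activations

acts = X
for d in (64, 64):                        # train blocks one at a time
    W, acts = train_block(acts, y, d)     # 'acts' carries no gradient history
```

Peak memory scales with the largest single block rather than the whole network, which is what lets larger architectures fit in limited VRAM.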

