Arxiv Papers

Igor Melnyk

Daily+
 
Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers Support this podcast: https://podcasters.spotify.com/pod/s ...
 
Together we dissect the global gaming scene in the company of Borislav "Overneathe" Belev and Vladislav "Deadset" Rashkovski. Live every Wednesday at 19:30 on arx.bg/stream.
 
From Plaça d'Osca, at 94.6 FM and also streaming from onadesants.cat, every Thursday from 10 to 11 p.m., Sergi Pujol, Joel Díaz and Manel Vidal analyze the latest sports news, rigorously live and always with a beer in hand. All presented and moderated by Andreu Juanola. You can find them at lasotana.cat, in the INEM queue, and also on Twitter. Production and tweets: Enric Gusó.
 
Vilassar Ràdio's morning magazine, presented by Jaume Cabot. The program focuses on local, regional and general news, with daily interviews with people from all walks of life. It features some twenty contributors who cover sports, theater, cinema, emotional well-being, sex, cooking, health, consumer affairs, women's wellness, tarot, round tables with seniors and young people, and more.
 
 
The paper discusses the non-identifiability of large language models (LLMs) and its implications for generalization, highlighting the need for a new theoretical perspective. https://arxiv.org/abs/2405.01964 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcas…
 
Developing a method for large language models to abstain from providing incorrect answers, using self-consistency and conformal prediction to reduce hallucination rates. https://arxiv.org/abs/2405.01563 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/a…
 
The paper explores using neural architecture search (NAS) for structural pruning of pre-trained language models to optimize efficiency and generalization performance, utilizing two-stage weight-sharing NAS to accelerate the search. https://arxiv.org/abs/2405.02267 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers…
 
The paper explores the impact of climate change on global food security, highlighting the need for sustainable agricultural practices to mitigate future risks. https://arxiv.org/abs/2404.18416 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-paper…
 
The paper introduces Consistent Self-Attention and Semantic Motion Predictor to enhance content consistency in diffusion-based generative models for text-to-image and video generation, enabling rich visual story creation. https://arxiv.org/abs/2405.01434 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers App…
 
The paper explores in-context learning (ICL) at extreme scales, showing performance improvements with hundreds or thousands of demonstrations, contrasting with example retrieval and finetuning. https://arxiv.org/abs/2405.00200 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcast…
 
WILDCHAT is a diverse dataset of 1 million user-ChatGPT conversations, offering researchers rich insights into chatbot interactions and potential toxic use cases. https://arxiv.org/abs/2405.01470 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxi…
 
NeMo-Aligner is a scalable toolkit for aligning Large Language Models with human values, supporting various alignment paradigms and designed for extensibility. https://arxiv.org/abs/2405.01481 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-paper…
 
Prometheus 2 is an open-source LM designed for evaluating responses, outperforming existing models in correlation with human and proprietary LM judgments. https://arxiv.org/abs/2405.01535 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1…
 
The study investigates dataset contamination in large language models for mathematical reasoning using the Grade School Math 1000 benchmark, finding evidence of overfitting and potential memorization of benchmark questions. https://arxiv.org/abs/2405.00332 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Pod…
 
The paper introduces SPPO, a self-play method for language model alignment, achieving state-of-the-art results without external supervision, outperforming DPO and IPO on various benchmarks. https://arxiv.org/abs/2405.00675 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.ap…
 
The study evaluates model editing techniques on Llama-3, finding sequential editing more effective than batch editing, and suggests combining both methods for optimal performance. https://arxiv.org/abs/2405.00664 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast…
 
An iterative preference optimization method enhances reasoning tasks by optimizing preferences between generated Chain-of-Thought candidates, improving accuracy on various datasets without sourcing additional data. https://arxiv.org/abs/2404.19733 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Pod…
 
https://arxiv.org/abs/2404.19708 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/supp…
 
Training language models to predict multiple future tokens at once improves sample efficiency, downstream capabilities, and inference speed without increasing training time; the gains are especially large for bigger models and generative tasks. https://arxiv.org/abs/2404.19737 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@ar…
 
https://arxiv.org/abs/2404.18796 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/supp…
 
The paper introduces Stylus, a method for efficiently selecting and composing task-specific adapters based on prompts' keywords, achieving high-quality image generation with improved efficiency and performance gains. https://arxiv.org/abs/2404.18928 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Po…
 
The paper introduces the Reinforced Token Optimization (RTO) framework for Reinforcement Learning from Human Feedback (RLHF), using a Markov decision process (MDP) formulation to improve token-wise reward learning and policy optimization. https://arxiv.org/abs/2404.18922 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts…
 
Kangaroo introduces a self-speculative decoding framework for accelerating large language model inference, using a shallow sub-network and early exiting mechanisms to improve efficiency. https://arxiv.org/abs/2404.18911 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple…
 
https://arxiv.org/abs/2404.16873 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/supp…
 
The study explores whether large language models understand their own language: a Greedy Coordinate Gradient optimizer crafts prompts that compel coherent responses from nonsensical inputs, revealing differences in efficiency and robustness. https://arxiv.org/abs/2404.17120 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers…
 
Transformers can use meaningless filler tokens to solve tasks, but learning to use them is challenging. Additional tokens can provide computational benefits independently of token choice. https://arxiv.org/abs/2404.15758 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.appl…
 
This paper investigates how transformer-based language models retrieve information from long contexts, identifying special attention heads called retrieval heads as crucial for this task. https://arxiv.org/abs/2404.15574 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.appl…
 
AUTOCRAWLER combines large language models with crawlers to efficiently handle diverse web environments, improving adaptability and scalability compared to traditional methods. https://arxiv.org/abs/2404.12753 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/po…
 
https://arxiv.org/abs/2404.16820 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/supp…
 