+
|
GPT-4o System Card
|
2024
|
OpenAI
A. M. Hurst
Adam Lerer
Adam P. Goucher
Adam Perelman
Aditya Ramesh
Aidan Clark
AJ Ostrow
Akila Welihinda
|
+
|
Modeling Multi-hop Question Answering as Single Sequence Prediction
|
2022
|
Semih Yavuz
Kazuma Hashimoto
Yingbo Zhou
Nitish Shirish Keskar
Caiming Xiong
|
+
|
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
|
2022
|
Aarohi Srivastava
Abhinav Rastogi
Abhishek S. Rao
Abu Awal Shoeb
Abubakar Abid
Adam Fisch
Adam R. Brown
Adam Santoro
Aditya Gupta
Adrià Garriga-Alonso
|
+
|
Generating Negative Samples for Sequential Recommendation
|
2022
|
Yongjun Chen
Jia Li
Zhiwei Liu
Nitish Shirish Keskar
Huan Wang
Julian McAuley
Caiming Xiong
|
+
|
GeDi: Generative Discriminator Guided Sequence Generation
|
2021
|
Ben Krause
Akhilesh Gotmare
Bryan McCann
Nitish Shirish Keskar
Shafiq Joty
Richard Socher
Nazneen Fatema Rajani
|
+
|
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
|
2021
|
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Fatema Rajani
Nitish Shirish Keskar
Thamar Solorio
|
+
|
Unsupervised Paraphrasing with Pretrained Language Models
|
2021
|
Tong Niu
Semih Yavuz
Yingbo Zhou
Nitish Shirish Keskar
Huan Wang
Caiming Xiong
|
+
|
Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution
|
2021
|
Wenpeng Yin
Shelby Heinecke
Jia Li
Nitish Shirish Keskar
Michael S. Jones
Shouzhong Shi
Stanislav Georgiev
Kurt Milich
J. A. Esposito
Caiming Xiong
|
+
|
Char2Subword: Extending the Subword Embedding Space from Pre-trained Models Using Robust Character Compositionality.
|
2020
|
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Fatema Rajani
Nitish Shirish Keskar
Thamar Solorio
|
+
|
Unsupervised Paraphrase Generation via Dynamic Blocking.
|
2020
|
Tong Niu
Semih Yavuz
Yingbo Zhou
Huan Wang
Nitish Shirish Keskar
Caiming Xiong
|
+
|
Unsupervised Paraphrasing with Pretrained Language Models
|
2020
|
Tong Niu
Semih Yavuz
Yingbo Zhou
Nitish Shirish Keskar
Huan Wang
Caiming Xiong
|
+
|
GeDi: Generative Discriminator Guided Sequence Generation
|
2020
|
Ben Krause
Akhilesh Gotmare
Bryan McCann
Nitish Shirish Keskar
Shafiq Joty
Richard Socher
Nazneen Fatema Rajani
|
+
|
Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm
|
2020
|
Sourya Basu
Govardana Sachitanandam Ramachandran
Nitish Shirish Keskar
Lav R. Varshney
|
+
|
ProGen: Language Modeling for Protein Generation
|
2020
|
Ali Madani
Bryan McCann
Nikhil Naik
Nitish Shirish Keskar
Namrata Anand
Raphael R. Eguchi
Po‐Ssu Huang
Richard Socher
|
+
|
Limits of Detecting Text Generated by Large-Scale Language Models.
|
2020
|
Lav R. Varshney
Nitish Shirish Keskar
Richard Socher
|
+
|
Limits of Detecting Text Generated by Large-Scale Language Models
|
2020
|
Lav R. Varshney
Nitish Shirish Keskar
Richard Socher
|
+
|
Improving out-of-distribution generalization via multi-task self-supervised pretraining
|
2020
|
Isabela Albuquerque
Nikhil Naik
Junnan Li
Nitish Shirish Keskar
Richard Socher
|
+
|
Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity
|
2020
|
Sourya Basu
Govardana Sachitanandam Ramachandran
Nitish Shirish Keskar
Lav R. Varshney
|
+
|
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
|
2020
|
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Fatema Rajani
Nitish Shirish Keskar
Thamar Solorio
|
+
|
Global Capacity Measures for Deep ReLU Networks via Path Sampling
|
2019
|
Ryan Theisen
Jason M. Klusowski
Huan Wang
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
Neural Text Summarization: A Critical Evaluation
|
2019
|
Wojciech Kryściński
Nitish Shirish Keskar
Bryan McCann
Caiming Xiong
Richard Socher
|
+
|
Unifying Question Answering and Text Classification via Span Extraction.
|
2019
|
Nitish Shirish Keskar
Bryan McCann
Caiming Xiong
Richard Socher
|
+
|
Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering
|
2019
|
Victor W. Zhong
Caiming Xiong
Nitish Shirish Keskar
Richard Socher
|
+
|
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
|
2019
|
Jasdeep Singh
Bryan McCann
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
Pretrained AI Models: Performativity, Mobility, and Change
|
2019
|
Lav R. Varshney
Nitish Shirish Keskar
Richard Socher
|
+
|
CTRL: A Conditional Transformer Language Model for Controllable Generation
|
2019
|
Nitish Shirish Keskar
Bryan McCann
Lav R. Varshney
Caiming Xiong
Richard Socher
|
+
|
Unifying Question Answering, Text Classification, and Regression via Span Extraction
|
2019
|
Nitish Shirish Keskar
Bryan McCann
Caiming Xiong
Richard Socher
|
+
|
Balancing Communication and Computation in Distributed Optimization
|
2018
|
Albert S. Berahas
Raghu Bollapragada
Nitish Shirish Keskar
Ermin Wei
|
+
|
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
|
2018
|
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
|
2018
|
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation.
|
2018
|
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
The Natural Language Decathlon: Multitask Learning as Question Answering
|
2018
|
Bryan McCann
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
Using Mode Connectivity for Loss Landscape Analysis.
|
2018
|
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
An Analysis of Neural Language Modeling at Multiple Scales
|
2018
|
Stephen Merity
Nitish Shirish Keskar
Richard Socher
|
+
|
Identifying Generalization Properties in Neural Networks
|
2018
|
Huan Wang
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
Using Mode Connectivity for Loss Landscape Analysis
|
2018
|
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
|
+
|
A limited-memory quasi-Newton algorithm for bound-constrained non-smooth optimization
|
2017
|
Nitish Shirish Keskar
Andreas Wächter
|
+
|
Regularizing and Optimizing LSTM Language Models
|
2017
|
Stephen Merity
Nitish Shirish Keskar
Richard Socher
|
+
|
Balancing Communication and Computation in Distributed Optimization
|
2017
|
Albert S. Berahas
Raghu Bollapragada
Nitish Shirish Keskar
Ermin Wei
|
+
|
Weighted Transformer Network for Machine Translation
|
2017
|
Karim Ahmed
Nitish Shirish Keskar
Richard Socher
|
+
|
Improving Generalization Performance by Switching from Adam to SGD
|
2017
|
Nitish Shirish Keskar
Richard Socher
|
+
|
A Limited-Memory Quasi-Newton Algorithm for Bound-Constrained Nonsmooth Optimization
|
2016
|
Nitish Shirish Keskar
Andreas Wächter
|
+
|
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
|
2016
|
Nitish Shirish Keskar
Dheevatsa Mudigere
Jorge Nocedal
Mikhail Smelyanskiy
Ping Tang
|
+
|
A second-order method for convex $\ell_1$-regularized optimization with active-set prediction
|
2016
|
Nitish Shirish Keskar
Jorge Nocedal
Figen Öztoprak
Andreas Wächter
|
+
|
adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs
|
2016
|
Nitish Shirish Keskar
Albert S. Berahas
|
+
|
A Second-Order Method for Convex $\ell_1$-Regularized Optimization with Active Set Prediction
|
2015
|
Nitish Shirish Keskar
Jorge Nocedal
Figen Öztoprak
Andreas Wächter
|
+
|
adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs
|
2015
|
Nitish Shirish Keskar
Albert S. Berahas
|