We will soon give the following presentation:
RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D
Shuhei Kurita, Naoki Katsura, Eri Onami
Happy to see you at our poster!
"Foyer Sud" - 114 (2:30-4:30)
#ICCV2023
Our team at RIKEN AIP is looking for students to do research with us! If you are a fourth-year undergraduate planning to enter a master's program, or a current master's or doctoral student, and are interested in keywords such as vision and language, 3D and language, real-world grounding, or natural language processing/understanding, let's do research together!
Our paper has been accepted to CVPR2022:
ScanQA: 3D Question Answering for Spatial Scene Understanding
The task is to jointly answer a question and identify the objects relevant to that question in (reconstructed) indoor 3D scenes. In the future, this may find applications in areas such as human-robot interaction research.
Paper accepted to CVPR2022!
ScanQA: 3D Question Answering for Spatial Scene Understanding
Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoki Kawanabe
Paper (arXiv):
(1/2)
#CVPR2022
ICLR2021 accepted! A new "generative" approach to vision-and-language navigation! We use a vision- and action-conditioned language model to navigate directly in the virtual environment. Work with
@kchonyc
I have posted the latest manuscript for vision-and-language navigation (VLN) on arXiv with Kyunghyun Cho
@kchonyc
!
"Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule" (1/3)
Something I've thought for a long time: in 2009, the Earth Simulator 2 delivered 131 TFLOPS in total, at a cost of about 18.9 billion yen.
The 2017 DGX Station (V100) offers a nominal 480 TFLOPS in single precision, for a few million yen.
My point is: today only a very limited set of institutions can train GPT-3, but in ten years even individual research labs will be able to train LLMs.
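The comparison above can be made concrete as a back-of-the-envelope calculation. A minimal sketch, assuming a DGX Station price of 3 million yen (the post only says "a few million yen", so this figure is an assumption):

```python
# Rough cost-performance comparison between the two systems in the post.
es2_tflops = 131          # Earth Simulator 2 (2009), total system
es2_cost_yen = 18.9e9     # about 18.9 billion yen

dgx_tflops = 480          # DGX Station (V100, 2017), nominal single precision
dgx_cost_yen = 3e6        # assumed price: "a few million yen"

# TFLOPS obtained per yen spent, for each system.
es2_per_yen = es2_tflops / es2_cost_yen
dgx_per_yen = dgx_tflops / dgx_cost_yen

# How much cost-performance improved over those 8 years.
improvement = dgx_per_yen / es2_per_yen
print(f"Cost-performance improvement: ~{improvement:,.0f}x")
```

Under these assumptions the ratio comes out at roughly four orders of magnitude in 8 years, which is what motivates the prediction about lab-scale LLM training.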
Good news! Thanks to Prof. Kyunghyun Cho and many people at NYU, I am starting my collaboration with Cho at NYU this January. If you are at NYU or in NY, I would be very happy to work with you. Feel free to reach out!
For instruction tuning, collecting data with GPT-3.5/4 is apparently much easier and yields better performance (many references), but doing so means you "use output from the Services to develop models that compete with OpenAI", which violates the terms of service. So the question is how far this is permitted, and whether non-commercial (NC) use would be acceptable.
Can visually conditioned language models navigate in realistic virtual environments? We propose a language-modeling-based approach to vision-and-language navigation. We will soon present at Spot A3 at
#ICLR2021
The code is available at
Google seems to be trying to suppress LLM hallucination, but isn't hallucination an essential part of intelligence? Think of humans writing novels, or wild animals at play: these are aspects of intelligence that are not directly relevant to real-world survival skills.
Our paper
"How do different tokenizers perform on downstream tasks in scriptio continua languages?: A case study in Japanese"
has been accepted to the ACL Student Research Workshop 2023 (@acl_srw)! 🎉
The language information access technology (LIAT) team in RIKEN AIP has created a new introduction movie for SHINRA project 2021!