I'm starting a PhD in computer science at Carnegie Mellon University in the US! I'll be tackling robots with language technologies at the core 🤖
Officially starting a PhD program at CMU
@LTIatCMU
I’m super excited to be a part of this great community!
#PhDlife
#cmu
#博士 #カーネギーメロン
I’ll be attending CVPR 2024 in Seattle, USA!
I’m looking forward to networking with peers, collaborators, and leaders in computer vision to share ideas and explore potential collaboration!
I'll be attending CVPR 2024! Looking forward to seeing you there 🙇
Henny Admoni, Zeynep Temel and Yonatan Bisk all work with robots, but in very different ways. On the new episode of Does Compute, they talk about everything from human-robot interaction to how robots can help us connect abstract thought to real-world situations.
I will be presenting OpenFusion on May 15 in the Semantic Scene Understanding 2 session, 13:30-15:00, at CC502!!
#ICRA2024
Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
Preprint:
Code:
"a humanoid robot in the street, finds a cardboard box on the ground and bends down to pick it up with both hands."
I don't think this can be used for inverse kinematics just yet, but I can see it getting there! Awesome work
@LumaLabsAI
📢 Exciting news! We're hosting the Learning Across Multiple Modalities (LAMM) Workshop at
#ACCV2024
!
📝 Don't miss out—submit your papers by September 11, 2024.
Join us in advancing research across modalities! 🌐
🔗
#MachineLearning
#ComputerVision
The LTI is holding a virtual info-session for prospective MLT and PhD applicants on Wednesday, November 6 from 12pm-1pm ET! Our faculty will be answering any questions about the application process. Register at the following link to attend:
📢If you're applying to
@SCSatCMU
PhD and come from a traditionally underrepresented group, current students are here to help!
Apply to the Graduate Applicant Support Program (GASP) by Oct. 8, 2024 to receive feedback on your application materials:
The Meiji government translated 10,000 technical books (applied science, industry, etc.).
Then, Japan became an industrial powerhouse.
Must be the greatest industrial policy investment ever made!
Once again: the importance of upper human capital.
h/t 🔒
If you are planning to apply to a CS PhD program this year, you should definitely sign up for PAMS!
You'll be matched with one CSE PhD student, have Zoom meetings with them & get lots of feedback on your SOP.
I mentored some applicants via this program, one of whom is now a PhD student at UW CSE 😎
I’ll be attending ICRA 2024 in Yokohama, Japan!
I’m looking forward to networking with peers, collaborators, and leaders in robotics to share ideas and explore potential collaboration!
I'll be attending ICRA 2024! Looking forward to seeing you there 🙇
🎉 Exciting news! A paper by one of my mentees
@taisei_hanyu
has been published in the prestigious Q1 journal, Remote Sensing! 🌍📡 So proud of their hard work and dedication. Well deserved!
Paper:
CMU SCS is also running a graduate application support program!
Apply to the Graduate Applicant Support Program (GASP) by Oct. 8, 2024 to receive feedback on your application materials:
Applying to Stanford's CS PhD program? Current graduate students are running a SoP + CV feedback program for URM applicants (broadly defined). Apply to SASP by Oct. 25! Info:
Salesforce presents xGen-MM (BLIP-3)
A Family of Open Large Multimodal Models
discuss:
This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated
I just released last year's lectures from my speech recognition course. Thanks, Shuichiro and Kashu, for editing!
Fall2023-SpeechRecognition&Understanding via
@YouTube
🤔 Can we train one policy to control a wide range of robots, from drones to quadrupeds, navigators to bimanual manipulators, and more?
🦾Introducing CrossFormer: a single policy that can perform manipulation, navigation, aviation, and locomotion:
Google Scholar Metrics for 2024 are released. CVPR (#2), NeurIPS (#7), and ICLR (#10) are in the top 10 (ranked by h5-index). Up from #4, #9, #10 last year.
Open-Fusion: build queryable, open-vocabulary 3d maps in real time. This is cool progress towards what is, in my opinion, one of the most important problems in robotics right now.
- As the robot explores, it takes in a stream of RGB-D images
- It extracts open-vocabulary features from each frame
OpenFusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
TL;DR: Open-Fusion builds an open-vocabulary 3D queryable scene from a sequence of posed RGB-D images in real-time.
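The pipeline described above can be sketched roughly as follows. This is a toy illustration, not the actual Open-Fusion code: the feature extractor here returns random vectors where a real system would use a vision-language encoder, the camera intrinsics are made up, and the map is a flat point list rather than a real-time 3D representation. It only shows the shape of the loop: integrate posed RGB-D frames with per-pixel features, then answer queries by feature similarity.

```python
import numpy as np

def extract_features(rgb, dim=64):
    # Stand-in for an open-vocabulary image encoder (a VLM in the real
    # system); returns one feature vector per pixel.
    h, w, _ = rgb.shape
    rng = np.random.default_rng(0)
    return rng.normal(size=(h, w, dim))

class OpenVocabMap:
    """Toy queryable map: stores (3D world point, feature) pairs."""

    def __init__(self):
        self.points, self.feats = [], []

    def integrate(self, rgb, depth, pose, dim=64):
        # Fuse one posed RGB-D frame into the map.
        feats = extract_features(rgb, dim)
        h, w = depth.shape
        for v in range(0, h, 8):          # subsample pixels for speed
            for u in range(0, w, 8):
                z = depth[v, u]
                if z <= 0:                # skip invalid depth
                    continue
                # Back-project the pixel to a camera-frame point (toy
                # intrinsics), then move it to the world frame via the pose.
                p_cam = np.array([u * z, v * z, z, 1.0])
                self.points.append((pose @ p_cam)[:3])
                self.feats.append(feats[v, u])

    def query(self, text_feat):
        # Return the map point whose feature best matches the query
        # embedding (cosine similarity) — e.g. an encoded text prompt.
        F = np.stack(self.feats)
        sims = F @ text_feat / (
            np.linalg.norm(F, axis=1) * np.linalg.norm(text_feat)
        )
        return self.points[int(np.argmax(sims))]
```

In the actual paper, queries and pixels share an embedding space learned by the vision-language model, so `query` can take an encoded text prompt like "cardboard box" and localize it in the map.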
I will be presenting the CyberDemo paper this afternoon from 5:15pm to 6:45pm at poster 324, in area 4A-E
@CVPR
. Feel free to stop by and ask me anything about the paper.