|
Hello everyone, I'm the lead author of Text2CAD, and I'd like to start by expressing my gratitude for all the support and constructive criticism on our paper. As an AI researcher, I find it essential to engage with diverse perspectives, as these discussions spark new ideas for improvement and growth. I would like to answer a few questions raised in the discussion.

Final CAD Representation: Our CAD model representation is inspired by Onshape's FeatureScript, a code-based CAD model generator. We opted for this because the available open-source datasets come from Onshape. The final CAD model is output as a sequence, which can be converted to .stl or .step format using PythonOCC. These sequences are editable because they contain the sketch and extrusion parameters. In the future, as Text2CAD matures, we plan to support direct Onshape FeatureScript generation as well. We're also planning to release a demo on Hugging Face soon, free for anyone to use. I would love to collaborate with the community on Onshape FeatureScript generation, so if anyone is interested in contributing, feel free to reach out via email!

LLM Usage: I want to clear up any confusion around how we use large language models (LLMs). Right now, LLMs aren't involved in generating the final CAD model. Instead, we use them to help create the dataset (for example, the text annotations). The actual CAD generation is powered by a Transformer-based architecture that produces the model as a sequence of sketch-and-extrude operations. That said, integrating LLMs into the CAD generation process could very well become a reality in the future, and we're keeping a close eye on that evolution.

Simpler Shapes: Some people have pointed out that Text2CAD currently generates simpler shapes, and I agree. This is mainly due to the limitations of the datasets we're working with (not the method itself), which are restricted in complexity.
We can currently only handle sketches with basic elements like lines, arcs, and circles, alongside extrusion operations. Creating new CAD datasets is challenging due to IP restrictions, and as many of you know, great AI systems are built on strong datasets. So it will take time for open-source CAD research to become fully usable in real applications. However, we're hopeful that as more complex datasets become available, you'll see far more advanced possibilities from Text2CAD and its successors.

Theory vs. Practice (what is Text2CAD's contribution?): As AI researchers, it's crucial that we balance theoretical contributions with practical applications. Translating geometric and parametric information from text into 3D models is a significant open problem in NLP. With Text2CAD, we've taken a step toward addressing this by providing a dataset that others can leverage for further research. I understand the sentiment that "a picture is worth a thousand words". The future of CAD is undoubtedly multi-modal, much like GPT-4, where both language and images will coexist. For this to happen, AI agents must deeply understand both modalities. Ultimately, it comes down to personal preference: whether a user finds working with text or images more intuitive. This question is just as relevant for other text-to-3D projects.

Assistance vs. Replacement: On my website, I use the term "AI-Assisted CAD Modeling" to emphasize that the goal of our research is to support CAD designers, not replace them. The primary aim of Text2CAD is to streamline and speed up the prototyping process, rather than to create complete models from scratch. I recognize that not everyone enjoys specifying numerous parameters, so we've also included the option to work with shape concepts without parameters. As with many tools, it comes down to personal preference: whether one prefers a more hands-on approach or quicker, conceptual modeling.

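For readers unfamiliar with the sketch-and-extrude idea mentioned above, here is a minimal illustrative sketch in Python of what an editable, parametric sequence could look like. It is not the actual Text2CAD representation: the class names (`Line`, `Arc`, `Circle`, `Sketch`, `Extrude`) and the helper `edit_extrusion` are hypothetical, chosen only to show why a sequence that carries its own parameters stays editable.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple, Union

Point = Tuple[float, float]


# Hypothetical sketch primitives: the three basic curve types named in the post.
@dataclass(frozen=True)
class Line:
    start: Point
    end: Point


@dataclass(frozen=True)
class Arc:
    start: Point
    mid: Point
    end: Point


@dataclass(frozen=True)
class Circle:
    center: Point
    radius: float


Curve = Union[Line, Arc, Circle]


@dataclass(frozen=True)
class Sketch:
    curves: Tuple[Curve, ...]  # closed 2D profile made of basic curves


@dataclass(frozen=True)
class Extrude:
    distance: float  # how far the preceding sketch is extruded


Step = Union[Sketch, Extrude]


def edit_extrusion(sequence: List[Step], new_distance: float) -> List[Step]:
    """Return a copy of the sequence with every extrusion distance replaced."""
    return [
        replace(step, distance=new_distance) if isinstance(step, Extrude) else step
        for step in sequence
    ]


# A cylinder as a two-step sequence: one circular sketch, then one extrusion.
cylinder: List[Step] = [
    Sketch(curves=(Circle(center=(0.0, 0.0), radius=2.0),)),
    Extrude(distance=5.0),
]

# Editing a parameter yields a new model without touching the sketch.
taller = edit_extrusion(cylinder, 10.0)
```

Because every step stores its own parameters, changing the model means changing a number in the sequence rather than regenerating the geometry. Converting such a sequence to .stl or .step would be a separate step handled by a CAD kernel.
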
Future Goal: Looking ahead, I'd love to hear from you, especially from CAD designers, about how you'd like AI to assist you in your CAD design process. Your opinions are important to AI researchers like myself, as they help us design tools that are truly useful and aligned with real-world needs. Reach out to me if you are interested. Cheers. |