This Will Become Outdated
First of all, AI and machine learning are rapidly advancing technologies, and their capabilities and uses are still in the ‘exploratory’ phase of adoption. See the date above? If you are reading this more than six months after February 2023, expect that anything outside of the Overview section is likely to be outdated.
Pulse on the Tech Industry
If you’ve been keeping tabs on the internet technology sector in 2022, you’ve noticed that the cryptocurrency craze has crashed (once again), social media is more of a dumpster fire than before, layoffs are occurring across the sector, and the new darling technology that venture capital firms are chasing is AI-generated content. ChatGPT has been making waves the past two months, with people who aren’t tech enthusiasts getting their first taste of the power of machine learning.
AI technology in general is just starting to make inroads into the AEC industry, mostly through its software vendors (Autodesk, Bluebeam, Adobe, Trimble, etc.), startups, and some experimentation in larger companies.
I expect that to change this year: AI will become a tool that many more of us in the AEC industry use in some professional capacity going forward.
Brief Overview of Machine Learning AI
Machine Learning (ML) is a type of AI that is trained on a curated data set for a specific purpose. The AI is designed to use one or more algorithms (linear regression, time series, decision trees, etc.) to “learn” inferences & relationships from the training data.
All of those learned inferences become part of the Machine Learning Model, which is used as the tool for fulfilling the designed purpose. That purpose can be anything from predicting outcomes to image/video recognition to, more recently, generating content.
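To make that “learning from data” idea concrete, here is a minimal Python sketch. It fits a straight line to example observations using closed-form least squares – the simplest version of the regression algorithms mentioned above. The concrete cure-time numbers are made up for illustration, not real test data.

```python
# Minimal "machine learning" sketch: learn y = a*x + b from example data
# using closed-form least squares. The data below is invented for illustration.

def fit_line(xs, ys):
    """Learn slope a and intercept b that best explain the (x, y) examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training data: concrete cure time (days) vs. strength (psi)
days = [1, 3, 7, 14, 28]
strength = [1600, 2300, 3200, 3900, 4400]

a, b = fit_line(days, strength)
print(f"learned relationship: strength ≈ {a:.0f} * days + {b:.0f}")
```

Everything a model like this “knows” comes from the examples it saw; real ML models do the same thing with far more parameters and far more data.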
The Important Caveats of ML
Here are the caveats of modern machine learning AI:
- ML models are very complex. This is especially true of Deep Learning models, which use multiple algorithms in multiple layers, like a neural network.
- Current ML models do not rely on pre-written equations or established facts. For example, a model trained on building demolition footage might infer that gravity is a force that accelerates the fall speed of objects, but it would not know the Newtonian gravitation equation F = G(m₁m₂)/r²; thus, if you ask it to predict falling speeds on the moon, it will most likely give wrong answers.
- A flawed training data set makes a flawed model. If you have heard the phrase “garbage in, garbage out,” ML puts that principle on hyperdrive, especially when it comes to bias. You have likely seen multiple examples of this.
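The gravity caveat can be shown in a few lines of Python. Below, a “model” learns one coefficient purely from simulated Earth free-fall observations, then badly over-predicts fall distances on the moon – while the physics equation g = GM/r² generalizes, because it encodes the established relationship rather than a pattern from one environment. The training data is simulated for illustration.

```python
# Why a learned pattern fails where a physics equation generalizes.
# Free-fall distance: d = 0.5 * g * t^2, with surface gravity g = G*M / r^2.

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def surface_gravity(mass_kg, radius_m):
    """Newton's equation gives g anywhere, from first principles."""
    return G * mass_kg / radius_m ** 2

# "Training data": drop times and distances observed on Earth only.
earth_g = surface_gravity(5.972e24, 6.371e6)   # ≈ 9.8 m/s^2
times = [1.0, 2.0, 3.0]
distances = [0.5 * earth_g * t ** 2 for t in times]

# The "model" learns a single coefficient c in d ≈ c * t^2 from the examples.
c = sum(d / t ** 2 for d, t in zip(distances, times)) / len(times)

# On the moon the learned model still predicts Earth-like falls...
moon_g = surface_gravity(7.342e22, 1.737e6)    # ≈ 1.6 m/s^2
t = 2.0
print("learned model predicts:", c * t ** 2)                # ~19.6 m (wrong)
print("physics equation predicts:", 0.5 * moon_g * t ** 2)  # ~3.2 m
```

The learned coefficient is only valid in the environment it was trained in; the equation is valid everywhere its terms apply.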
Natural Language Models (ChatGPT et. al.)
As of late 2022, Natural Language Processing (NLP) models are getting the most attention, with OpenAI’s ChatGPT in the spotlight. NLP is not new in computer science; the goal of making a machine understand human language has been worked on for over 60 years. The chain of breakthroughs since 2013, however, has changed the pace of advancement.
The modern NLP models, sometimes called “Large Language Models” (LLMs), are Deep Learning models trained on vast amounts of written work – from poetry, novels, and non-fiction books to blog posts, academic articles, and programming source code. During training, the machine learning AI builds a vastly complex framework of relationships between words (using methods such as word vectors, transformer networks, etc.).
Once trained, the LLM acts as the engine the AI uses to generate written responses to any query a user gives.
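A toy illustration of the word-vector idea: words are represented as lists of numbers, and words that appear in similar contexts end up pointing in similar directions, measured by cosine similarity. The three-dimensional vectors below are invented by hand for illustration – real models learn embeddings with hundreds of dimensions from text.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: near 1.0 = similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented 3-dimensional "embeddings" (real models learn these from text,
# they are not hand-picked numbers like these).
vectors = {
    "concrete": [0.9, 0.8, 0.1],
    "rebar":    [0.8, 0.9, 0.2],
    "poetry":   [0.1, 0.2, 0.9],
}

print(cosine_similarity(vectors["concrete"], vectors["rebar"]))   # high
print(cosine_similarity(vectors["concrete"], vectors["poetry"]))  # low
```

Relationships like these, scaled up enormously, are part of what lets an LLM produce text that reads as fluent and on-topic.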
LLMs are not all-knowing answer engines
This newest iteration of Natural Language AI is getting very accurate at understanding the meaning of language… but it does not understand facts about the world.
Recall the caveat mentioned earlier: ML does not rely on pre-written equations or established facts. The model is built from a data set containing all sorts of texts, both fiction and non-fiction. While the training data includes factual information, the model is not trained to evaluate those facts, only how they are written.
Thus, when you ask ChatGPT a question, the AI can generate an answer that is grammatically well written, coherent, and appears correct at first glance. For basic information on a topic, those answers are likely to be mostly correct because the underlying training data most often reflects a general consensus of the truth.
But ask something more in depth (or about a specialized topic), and a discerning reader can usually pick up discrepancies in the logic and spot minor errors. When asked to generate something like a news article or academic report, it can produce:
- Misquotes attributed to well-known figures
- Fake website links
- References to articles that do not exist
In some cases, the facts can be outright wrong even though the response is written well.
The Expert vs AI – The Information Gap
So when it comes to knowledge of AEC topics, AI still falls short of humans, especially experienced industry experts.
That is partly because of the training data behind the natural language model. Here are the ways the data affects the information a model produces:
- Quantity – some topics simply don’t have enough information written in natural language. Much of our AEC industry expertise is still in tables, charts, plans, shop drawings, photos, and inside our experts’ heads.
- Quality – the quality of information matters as well. Those of us who have been involved in construction defect litigation for a while know how long it took for extrapolation to be used semi-correctly in claims. And there are also times when some people have misled their clients with falsehoods or incomplete information.
- Diversity – as mentioned before, today’s NLP models use a lot of diverse sources, including old sci-fi stories with now-disproven physics or unfeasible technologies.
- Relevance – the training data needs to be relevant to the task at hand. As of 2023, we do not have a model that has been trained specifically on AEC industry expertise.
Despite all that, industry professionals who have tried ChatGPT are still impressed with its capabilities. Comparing it to someone in their profession, they describe it as an assistant or intern who writes fast but has only a very rough understanding of the industry.
Client Reports Won’t Be AI-Generated (yet)
Even if a company builds an LLM focused on our industry, that doesn’t solve the basic design limitation of machine learning: the missing established facts and equations. Until that changes, any future “AEC Expert-GPT” asked to produce a report requiring testing, calculations, and in-depth analysis will likely result in more of a trainwreck than a time-saving operation.
The good news is that dealing with facts, equations, and data is what other machine learning applications (and traditional programs) can be used for. As mentioned earlier, specific use cases are being explored and developed in our industry today. One such application type, computer vision, is focused on detecting specific objects inside photos or a video feed. Once proven reliable, these will become additional tools our professionals can use to serve clients better.
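As a toy sketch of the computer-vision idea (a real system would run a trained neural network over actual photos, not hand-written rules), even a simple pass over a grid of pixel brightness values can “detect” objects by grouping bright pixels into connected blobs:

```python
# Toy object detection: count connected bright regions ("blobs") in a tiny
# grayscale image, represented as a grid of brightness values (0-255).
# Real computer vision uses trained models on real photos; this only shows
# the flavor of turning pixels into "detected objects".

def count_blobs(image, threshold=128):
    rows, cols = len(image), len(image[0])
    seen = set()

    def flood(r, c):
        # Mark every bright pixel connected to (r, c) as part of one blob.
        stack = [(r, c)]
        while stack:
            r, c = stack.pop()
            if (r, c) in seen or not (0 <= r < rows and 0 <= c < cols):
                continue
            if image[r][c] < threshold:
                continue
            seen.add((r, c))
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])

    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold and (r, c) not in seen:
                blobs += 1
                flood(r, c)
    return blobs

# Two bright "objects" on a dark background.
image = [
    [10, 10, 200, 200, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 10, 10, 220],
    [10, 10, 10, 220, 220],
]
print(count_blobs(image))  # 2
```

The gap between this toy and a production jobsite-photo classifier is exactly the trained model – which is why reliability has to be proven before these tools go into professional use.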
LLMs won’t be a fad either. A savvy expert could use one to revise sections of a report, explaining the findings in layman’s terms and allegories to improve clarity. Instead of being the primary writer, the LLM will act like an autocomplete the expert can direct.
Concerns with using AI Content Generating Tools
While there’s a lot of excitement about the uses of large language and other ML models, a number of practitioners and critics are raising questions about their drawbacks and negative societal impacts.
Copyright & intellectual property ownership involving AI is a legitimate concern, and as reported last year by The Verge, AI-generated content comes with many unanswered legal questions about copyright.
The US Copyright Office, however, laid out its position in its Copyright Compendium, Third Edition (2021), where section 313.2 states “the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”
Another less-talked-about concern is the potential environmental impact of using LLMs (and other ML models). These models are complex, and it takes a lot of processing power – and thus energy – to operate them. Some open-source ML models only require a relatively new GPU, but ChatGPT and other AI services run on entire data centers. While their energy footprint pales in comparison to proof-of-work cryptocurrencies (which burned enough energy to rival industrialized nations just to keep running), as more people use these tools in their practices, it will add up. Some of us more environmentally conscious folks are asking questions like, “How much energy does machine learning burn to write an email?”
What’s Expected in the Future
In short: A lot.
I touched mostly on the Natural Language Processing generators, but if you’ve been on social media, you’ve also likely seen machine learning generated images (Stable Diffusion, MidJourney, Dall-E, etc.), DeepFakes (ML video editing), and a wide variety of other synthetic media.
As mentioned earlier, other ML applications are being explored in the AEC industry, some of the more exciting ones include:
- Generated as-built BIM & site plans
- Jobsite photo classification
- Live project risk prediction
- Potential new construction materials coming from AI-assisted biotech advancements (in the mid-to-far future)
In this fast-evolving part of the industry, VERTEX embraces its Lifelong Learning value and does its best to stay at the forefront of upcoming technology advancements to find solutions that bring Meaningful Value to our clients. We look forward to bringing you more insights about AI in the AEC industry in the future.