Finding Differentiable Environments

Finding Differentiable Environments

Abstract

When implementing architectural algorithms or parametric design, it is easy to think of them simply as algorithms that produce results from parameters. However, in heuristic optimizers such as GA or ES, the algorithm itself becomes an environment in which parameters are varied. Furthermore, when the algorithm is used as a criterion for model training, as in reinforcement learning, the environment itself becomes the standard by which the model's direction is judged. As part of exploring what characteristics such an environment should have, we investigated the significance of differentiable environments and tested the results.

We set up two environments, one with a definite gradient cutoff and one without, and by comparing gradient descent, AdamW, and GA methods we were able to confirm the significance of differentiable environments.

Problem

Creating an environment for architectural algorithms - or parametric design environments - intuitively takes the following form: generate a circle with position and radius as parameters. The loss is defined as one-tenth of the difference between the circle's area and the area of a circle with radius 5 (25). When the logic is this simple, there is usually little disconnect between parameters and results. The radius-loss graph can be expressed as follows (for x >= 0).

However, as algorithms become more complex, the environment becomes a black box, and even its developers find it difficult to understand how the results respond to the parameters. This is where the problem arises. Even without considering reinforcement learning models, optimization processes like GA also need to "measure" results as parameters change. When the impact of parameter differences on the environment is too random, the results of parameter changes become excessively noisy. The figure above shows the problem with just one parameter, and in architectural algorithm engines it is almost impossible to use only one or two parameters. Not only are smooth curves showing tendencies hard to identify; optimization modules can hardly grasp the tendencies at all. There is no continuous engine without a continuous [parameter - loss] relationship. Therefore, we started by consciously trying to keep the score (or negative loss) that the environment produces from its parameters differentiable whenever we add environment functions and parameters.

Premises

Including parametric design algorithms, the environment mentioned in this post refers to a module that can calculate geometry-related data and a loss from input parameters. To explicitly ensure differentiability and parameter connectivity, every step from the starting parameters to the loss calculation is performed only through torch tensor operations.

Environment Details

Parameter Definition - 16 parameters in total

Environment 1 - x, y position ratios for 4 rectangles: (x_ratio, y_ratio) corresponds to each rectangle's width and height, ensuring a one-to-one correspondence with the (x, y) position.

Environment 2 - interpolation and offset ratios for 4 rectangles: the offset operates relative to 0.5 of the corresponding width or height, ensuring a one-to-one correspondence with the (x, y) position generated this way. In other words, there is a definite gradient cutoff. (Figure: using 0.5 to divide the four quadrants in Environment 2.)

Common - x, y size ratios for 4 rectangles: after the position is determined, x_size and y_size are determined as a ratio of the remaining available distance.

In other words, both environments cover the same search space: for every point (x, y) ∈ ℝ² in the 2D plane, there exists a one-to-one corresponding parameter set that generates it. A minimal sketch of how such an environment can be written with tensor operations only is shown below.
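To make the premise concrete, here is a minimal sketch of a tensor-only environment in the spirit of Environment 1. The function name, the square bound, and the exact parameter layout are illustrative assumptions, not the actual engine; the point is only that every step from parameter to area stays inside torch operations so gradients can flow.

import torch

def environment_1(params: torch.Tensor, bound: float = 10.0):
    # params: shape (4, 4) -> per rectangle: x_ratio, y_ratio, x_size_ratio, y_size_ratio.
    # Sigmoid squashes raw values into (0, 1) while keeping gradients alive.
    p = torch.sigmoid(params)
    x_ratio, y_ratio, x_size_ratio, y_size_ratio = p.unbind(dim=1)

    # Position: one-to-one mapping of (x_ratio, y_ratio) onto the plane.
    x = x_ratio * bound
    y = y_ratio * bound

    # Size: a ratio of the remaining distance to the boundary.
    w = x_size_ratio * (bound - x)
    h = y_size_ratio * (bound - y)

    areas = w * h
    return x, y, w, h, areas

params = torch.randn(4, 4, requires_grad=True)
x, y, w, h, areas = environment_1(params)
loss = areas.var()          # e.g., the variance-of-areas term described below
loss.backward()             # gradients flow all the way back to the raw parameters
print(params.grad.shape)    # torch.Size([4, 4])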
Loss Definition (each loss is multiplied by a coefficient and added to the final loss)

L1: Variance of each rectangle's area (to make the rectangles have similar areas) $$ \displaystyle L1 = \frac{1}{n} \sum_{i=1}^{n} (Area(Rectangle_i) - \mu)^2 $$

L2: Sum of overlapping areas between rectangles (to prevent rectangles from overlapping) $$ \displaystyle L2 = \sum_{i=1}^{n} \sum_{j=i+1}^{n} Area(Rectangle_i \cap Rectangle_j) $$

L3: Deviation of each rectangle's aspect ratio from 1.5 $$ \displaystyle L3 = \sum_{i=1}^{n} \left| \frac{\max(w_i, h_i)}{\min(w_i, h_i)} - 1.5 \right| $$

L4: Sum of the areas of all rectangles (the total area should be large) $$ \displaystyle L4 = \sum_{i=1}^{n} Area(Rectangle_i) $$

Checking Differentiability through Primitive Gradient Descent

# This is not an optimizer that modifies a learning model's weights,
# but differentiable programming that directly updates the parameters used to generate results.
optimizer = torch.optim.SGD([parameters], lr=learning_rate)

A sigmoid was used to keep the parameters in the 0-1 range. We tested both environments with this simple method of applying the backpropagated gradients directly to the parameters. Environment 1 showed more consistent tendencies, reached smaller final loss values, and produced shape arrangements closer to the intended results. (Since the loss is not complete but is meant for understanding test tendencies, we focus on the loss values themselves rather than on the images of the resulting arrangements.) In other words, while both environments cover the same search space, Environment 1 can be considered more differentiable.

(Loss 1-4 curves for Environment 1 and Environment 2.)

Next, we run optimization on these two environments with the AdamW optimizer and a GA module; the sketch below shows the kind of differentiable-programming loop used for the gradient-based tests.
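The following is a minimal sketch of that loop, reusing the illustrative environment_1 from the earlier sketch. The raw parameters themselves are registered with the optimizer, so each step updates the generative parameters directly. Only two of the four losses are included, and the coefficients are arbitrary placeholders rather than the values used in the experiments.

import torch

def total_loss(params: torch.Tensor) -> torch.Tensor:
    x, y, w, h, areas = environment_1(params)
    l1 = areas.var()               # similar areas
    l4 = -areas.sum()              # larger total area (negated so that smaller loss is better)
    return 1.0 * l1 + 0.1 * l4     # illustrative coefficients only

params = torch.randn(4, 4, requires_grad=True)
optimizer = torch.optim.SGD([params], lr=0.05)   # or torch.optim.AdamW([params], lr=0.05)

for step in range(200):            # the post compares runs of about 200 evaluations
    optimizer.zero_grad()
    loss = total_loss(params)
    loss.backward()                # gradients flow through the environment itself
    optimizer.step()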
Optimization Test Results using the AdamW Optimizer

optimizer = torch.optim.AdamW([parameters], lr=learning_rate)

(Loss 1-4 curves for Environment 1 and Environment 2.)

Optimization Test Results using a Genetic Algorithm

With GA, the final results are similar: given sufficient computation, and since the search space itself is the same, both environments converge to almost identical scores. However, it is often impossible to provide sufficient computation every time; moving quickly toward the answer also matters. Environment 1 clearly has better initial stability (see the difference between the two environments' initial graphs). Moreover, the GA used a population of 100 per generation, so these results come from 20,000 evaluations, unlike the previous two cases, which used only 200. While GA may be one way to guarantee final performance, it is hard to consider it the most efficient of these three methods (200 evaluations would amount to only 2 GA generations).

(Loss 1-4 curves for Environment 1 and Environment 2.)

Conclusion

We confirmed that differentiable environments, in which changes in parameters lead to continuous changes in the environment and its results, achieve better results during optimization. This tendency was observed not only with simple gradient descent and AdamW but even with an optimization algorithm like GA. To achieve similar scores, GA required far more evaluations, which suggests that more efficient methods than GA exist; we expect to reduce the total number of computations through differentiable environments and related methods. While it may not be possible to make every parameter differentiable, we reaffirmed that it is worth consciously adding parameters to the environment in a differentiable form whenever possible during development. Let's use differentiable environments!


Quantifying Architectural Design

Quantifying Architectural Design

1. Measurement: Why It Matters

In the context of space and architecture, "measuring and calculating" goes beyond merely writing down dimensions on a drawing. For instance, an architect not only measures the thickness of a building's walls or the height of each floor, but also observes how long people stay in a given space and in which directions they tend to move. In other words, the act of measuring provides crucial clues for comprehensively understanding a design problem. Of course, not everything in design can be fully grasped by numbers alone. However, as James Vincent emphasizes, the very act of measuring makes us look more deeply into details we might otherwise overlook. For example, the claim "the bigger a city gets, the more innovation it generates" gains clarity once supported by quantitative data. Put differently, measurement in design helps move decision-making from intuition alone to a more clearly grounded basis.

2. The History of Measurement

Leonardo da Vinci's Vitruvian Man

Let's take a look back in time. In ancient Egypt, buildings were constructed based on the length from the elbow to the tip of the middle finger. In China, the introduction of the "chi" (Chinese foot) allowed large-scale projects like the Great Wall to proceed with greater precision. Moving on to the Renaissance, Leonardo da Vinci's Vitruvian Man highlights the fusion of art and mathematics. Meanwhile, Galileo Galilei famously remarked, "Measure what is measurable, and make measurable what is not," showing how natural phenomena were increasingly described in exact numerical terms. In the current digital age, we can measure at the nanometer scale, and computers and AI can analyze massive datasets in an instant. Where artists of old relied on finely honed intuition, we can now use 3D printers to produce structures with a margin of error of less than 1 mm. All of this underscores the importance of measurement and calculation in modern urban planning and architecture.

3. The Meaning of Measurement

British physicist William Thomson (Lord Kelvin) once stated that if something can be expressed in numbers, it implies a certain level of understanding; otherwise, that understanding is incomplete. In this sense, measurement is both a tool for broadening and deepening our understanding and a basis for conferring credibility on the knowledge we gain. Even in architecture, structural safety relies on calculating loads and understanding the physical properties of materials. When aiming for energy efficiency, predicting daylight levels or quantifying noise can guide us to the optimal design solution. In urban studies, as Geoffrey West points out in his book Scale, the link between increasing population and heightened rates of patent filings or innovation is not merely a trend but a statistically verifiable reality, one that helps guide city design more clearly. However, interpretations can vary drastically depending on what is measured and which metrics one chooses to trust. As demonstrated by the McNamara fallacy, blind belief in the wrong metrics can obscure the very goals we need to achieve. Because architecture and urban design deal with real human lives, they must also consider values not easily reducible to numbers, such as emotional satisfaction or a sense of place. This is why Vaclav Smil cautions that we must understand what the numbers truly represent: we shouldn't look at numbers alone but also at the context behind them.
4. The “Unmeasurable”

In design, aesthetic sense is a prime example of what's difficult to quantify. Beauty, atmosphere, user satisfaction, and social connectedness all have obvious limits to numerical representation. Still, there have been attempts to systematize such aspects, for instance Birkhoff's aesthetic measure (M = O/C, where O = order and C = complexity). Some methods also translate survey feedback into ratings or scores. Yet qualitative context and real-world experiences are not easily captured in raw numbers.

Informational Aesthetics Measures

Design columnist Gillian Tett has warned that over-reliance on numbers can make us miss the bigger context. We need, in other words, to connect "numbers and the human element." Architectural theorist K. Michael Hays notes that although the concept of proportion from ancient Pythagorean and Platonic philosophy led to digital and algorithmic design in modern times, we still cannot fully translate architectural experience into numbers. Floor areas or costs might be represented numerically, but explaining why a particular space is laid out in a certain way cannot be boiled down to mere figures. In the end, designers must grapple with both the quantitative and the "not-yet-numerical" aspects to create meaningful spaces.

5. Algorithmic Methods to Measure, Analyze, and Compute Space

Lately, there has been a rise in sophisticated approaches that integrate graph theory, simulation, and optimization into architectural and urban design. For example, one might represent a building floor plan as nodes and edges in order to calculate the depth or accessibility between rooms, or use genetic algorithms to automate layouts for hospitals or offices. Researchers also frequently simulate traffic or pedestrian movement at the city scale to seek optimal solutions. By quantifying the performance of a space through algorithms and computation, designers can make more rational and creative choices based on those data points.

Spatial Networks: Representing building floor plans or urban street networks using nodes and edges allows the evaluation of connectivity between spaces (or intersections). For instance, if you treat doors as edges between rooms, you can apply shortest-path algorithms (like Dijkstra or A*) to assess accessibility among different rooms.

Justified Permeability Graph (JPG): To measure the spatial depth of an interior, choose a particular space (e.g., the main entrance) as a root, then lay out a hierarchical justified graph. By checking how many steps it takes to reach each room from that root (i.e., the room's depth), one can identify open or central spaces versus isolated spaces. A minimal sketch of this depth calculation follows the graph-based methods below.

Visibility Graph Analysis (VGA): Divide the space into a fine grid and connect points that are visually intervisible, thus creating a visual connectivity network. Through this, you can predict which spots have the widest fields of view and where people might be most likely to linger.

Axial Line Analysis: Divide interior or urban spaces into the longest straight lines (axes) along which a person can actually walk, then treat intersections of these axial lines as edges in a graph. Using indices such as Integration or Mean Depth helps you estimate which routes are centrally located and which spaces are more private.
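The graph-based analyses above all reduce to simple operations on a room or street network. As a minimal illustration of the depth measure behind justified permeability graphs, the sketch below runs a breadth-first search over a room-adjacency dictionary; the rooms and connections are invented for illustration only.

from collections import deque

# Hypothetical room-connectivity graph: rooms are nodes, doors are edges.
doors = {
    "entrance": ["hall"],
    "hall": ["entrance", "living", "kitchen"],
    "living": ["hall", "bedroom"],
    "kitchen": ["hall"],
    "bedroom": ["living"],
}

def depths_from(root: str, graph: dict) -> dict:
    """Breadth-first search: number of steps from the root to every room."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        room = queue.popleft()
        for neighbour in graph[room]:
            if neighbour not in depth:
                depth[neighbour] = depth[room] + 1
                queue.append(neighbour)
    return depth

print(depths_from("entrance", doors))
# {'entrance': 0, 'hall': 1, 'living': 2, 'kitchen': 2, 'bedroom': 3}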
Genetic Algorithm (GA): For building layouts or urban land-use planning, multiple requirements (area, adjacency, daylight, etc.) can be built into a fitness function so that solutions evolve through random mutations and selections, moving closer to an optimal layout over successive generations.

Reinforcement Learning (RL): Treat pedestrians or vehicles as agents, and define a reward function for desired objectives (e.g., fast movement or minimal collisions). The agents then learn the best routes by themselves. This is particularly useful for simulating crowd movement or traffic flow in large buildings or at urban intersections.

6. Using LLM Multimodality for Quantification

With the advancement of AI, there is now an effort to have LLMs (Large Language Models) analyze architectural drawings or design images, partly quantifying qualitative aspects. For instance, one can input an architectural design image and compare it to text-based requirements, then assign a "design compatibility" score. There are also pilot programs that compare building plans, automatically determine whether building regulations are met, and calculate the window-to-wall ratio. Through such processes, things like circulation distance, daylight area, and facility accessibility can be cataloged, ultimately helping designers decide, "This design seems the most appropriate."

7. A Real-World Example: Automating Office Layout

Automating the design of an office floor plan can be roughly divided into three steps. First, an LLM interprets user requirements and translates them into a computer-readable format. Next, a parametric model and optimization algorithms generate dozens or even hundreds of design proposals, which are then evaluated against various metrics (e.g., accessibility, energy efficiency, or corridor length). Finally, the LLM summarizes and interprets these results in everyday language. In this scenario, an LLM (like GPT-4) that has been trained on a massive text corpus can convert a user's specification, "We need space for 50 employees, 2 meeting rooms, and ample views from the lobby," into a script that parametric modeling tools understand, or provide guidance on how to modify the code.

Zoning diagram for office layout by architect

Let's look at a brief example. In the past, an architect would have to repeatedly sketch bubble diagrams and manually adjust "Which department goes where? Who gets window access first?" With an LLM, you can directly transform requirements like "Maximize the lobby view and place departments with frequent collaboration within 5 meters of each other" into a parametric model. Using basic information such as columns, walls, and window positions, the parametric model employs rectangle decomposition and bin-packing algorithms (among others) to generate various office layouts. Once these layouts are automatically evaluated on metrics such as adjacency, distance between collaborating departments, and the ratio of window usage, the optimization algorithm ranks the solutions with the highest scores.

LLM-based office layout generation

If humans had to sift through all these data manually, checking average inter-department distances (5 m vs. a target of 3-4 m), it would get confusing quickly. This is where the LLM comes in again. It can read the optimization results and summarize them in easy-to-understand statements such as: "This design meets the 25% collaboration space requirement, but its increased window use may lead to higher-than-expected energy consumption."
Some example summaries might include: "The average distance between collaborating departments is 5 meters, which is slightly more than your stated target of 3-4 meters." "Energy consumption is estimated to be about 10% lower than the standard for an office of this size." If new information comes in, the LLM can use RAG (Retrieval Augmented Generation) to review documents and instruct the parametric model to include additional constraints, like "We need an emergency stairwell" or "We must enhance soundproofing." Through continuous interaction between human and AI, designers can explore a wide range of alternatives in a short time and leverage "data-informed intuition" to make final decisions.

Results of office layout generation

Essentially, measurement is a tool for deciding "how to look at space." Although determining who occupies which portions of an office with what priorities can be complex, combining LLMs, parametric modeling, and optimization algorithms makes this complexity more manageable. In the end, the designer or decision-maker does not lose freedom; rather, they gain the ability to focus on the broader creative and emotional aspects of a project, freed from labor-intensive repetition. That is the promise of an office layout recommendation system powered by "measurement and calculation."

8. Conclusion

Ultimately, measurement is a powerful tool for "revealing what was previously unseen." Having precise numbers clarifies problems, allowing us to find solutions more objectively than when relying on intuition alone. In the age of artificial intelligence, LLMs and algorithms extend this process of measurement and calculation, making it possible to quantify at least part of what was once considered impossible to measure. As seen in the automated office layout example, even abstract requirements become quantifiable parameters thanks to LLMs, and parametric models plus optimization algorithms generate and evaluate numerous design options, proposing results that are both rational and creative.


AI and Architectural Design: Present and Future

AI and Architectural Design: Present and Future

Introduction Artificial intelligence (AI) is making waves across a broad range of fields, from automating repetitive tasks to tackling creative problem-solving. Architecture is no exception.

What used to be simple computer-aided drafting (CAD) has now evolved into generative design and optimization algorithms that are transforming the entire workflow of architects and engineers. This article explores how AI is bringing innovation to architectural design, as well as the opportunities and challenges that design professionals face. Background of AI in Architectural Design Rule-Based Approaches and Shape Grammar Early research on automated design relied on rules predefined by humans. Shape Grammar, which emerged around the 1970s, demonstrated that even a limited set of rules could replicate the style of a particular architect. One famous example was using basic geometric rules to recreate the floor plans of Renaissance architect Palladio’s villas. However, because these systems only operate within the boundaries of predefined rules, tackling highly complex architectural problems remained difficult. . Koning's and Eisenberg's compositional forms for Wright's prairie-style houses. Genetic Algorithms and Deep Reinforcement Learning Inspired by natural evolution, genetic algorithms evaluate random solutions, then breed and mutate the best ones to progressively improve outcomes. Although they are powerful for exploring complex problems, they become time-consuming when many variables are involved. Deep reinforcement learning, popularized by the AlphaGo AI for the game of Go, learns policies that maximize rewards through trial and error. In architecture, designers can use these techniques to create decision-making rules based on context—site shape, regulations, design forms, and more. Generative Design and the Design Process Automating Repetitive Tasks and Proposing Alternatives Generative design automates repetitive tasks and simultaneously produces a wide range of design options. Traditionally, time constraints meant only a few ideas could be tested. With AI, it’s possible to experiment with dozens—or even hundreds—of designs at once, while also evaluating factors such as sunlight, construction cost, functionality, and floor area ratio. The Changing Role of the Architect The architect’s role in an AI-driven environment is to set objectives and constraints for the AI system, then curate the generated results. Rather than merely drafting drawings or performing calculations, the architect configures the AI’s design environment and refines the best solutions by adding aesthetic sense. The Role of Architects in the AI Era In 1966, British architect Cedric Price posed a question: “Technology is the answer… but what was the question?” This statement is more relevant today than ever. The core challenge is not “How do we solve a problem?” but rather “Which problem are we trying to solve?” While advances in AI have largely focused on problem-solving, identifying and defining the right problem is critical in architectural design. Defining the Problem and Constructing a State Space Someone with both domain knowledge and computational thinking is best suited to define these problems. With a precisely formulated problem, even a simple algorithm can suffice for effective results—no need for overly complex AI. In this context, the architect’s role shifts from producing a single, original masterpiece to designing the interactions between elements—that is, shaping the state space. Meanwhile, AI explores and discovers the best combination among countless existing possibilities. 
Palladio’s Villas and Shape Grammar A classic example is Rudolf Wittkower’s analysis of Renaissance architect Andrea Palladio’s villas. Wittkower uncovered recurring geometric principles, such as the “nine-square grid,” symmetrical layouts, and harmonic proportions (e. g. , 3:4 or 2:3). While Palladio’s villas appear individually unique, they share particular rules under the hood. Researchers later turned these observations into shape grammar, enabling computers to automatically generate Palladio-style floor plans—demonstrating that once hidden rules are clearly extracted, computers can “discover” new variations by applying them systematically. Landbook and LBDeveloper Spacewalk is a company developing AI-powered architectural design tools. In 2018, they released Landbook, an AI-based real-estate development platform that can instantly show property prices, legal development limits, and expected returns. LBDeveloper, on the other hand, is designed for small housing redevelopment projects—automating the process of checking feasibility and profitability. Simply entering an address reveals whether redevelopment is possible and the potential returns. AI-Based Design Workflow LBDeveloper calculates permissible building envelopes based on regulations, then generates building blocks using various grid and axis options. It arranges these blocks, optimizes spacing to avoid collisions, and evaluates efficiency using metrics like the building coverage ratio (BCR) and floor area ratio (FAR). The final step is choosing the best-performing design among the candidates. The total number of combinations is calculated as (Number of Axis Options) × (Number of Grid Options) × (Number of Floor Options) × (Row Shift × Column Shift Options) × (Rotation Options) × (Additional Block Options). For instance, a rough combination could be 4 × 100 × 31 × (10×10) × 37 × (varies by active blocks). In practice, each stage performs its own optimization process rather than brute-forcing all possible combinations. Future Prospects Design inherently entails an infinite variety of solutions, making it impossible to reach a perfect optimum when bound by real-world constraints. Even optimization algorithms struggle to guarantee satisfaction when faced with vast search spaces or strict regulations. Large Language Models (LLMs) offer a way to mitigate these problems by generating new ideas and picking best-fit solutions from a range of options. In fields like architecture, where complicated rules are intertwined, LLMs can be especially powerful. LLMs for Design Evaluation and Decision LLMs can take pre-generated design proposals and evaluate them across various criteria—legal requirements, user demands, visual balance, circulation efficiency, and more. This kind of automated review process is particularly useful when there are numerous candidates. Without AI, an architect must manually inspect each option, but with an LLM’s guidance, it becomes easier to quickly identify viable designs. Moreover, LLMs are prompt-based, so you only need to phrase your requirements in natural language—for instance, “Please check if there’s adequate distance between our building and adjacent properties. ” The LLM then analyzes the condition and summarizes the results. Simultaneously, it considers spatial concepts along with functional requirements, measuring how the proposed design meets the architect’s goals and suggesting improvements. Ultimately, the LLM allows designers to focus on more profound aspects of design. 
Architects can rapidly verify whether a concept is feasible, whether regulations and practical constraints are satisfied, and whether the design is aesthetically sound. By taking care of details that might otherwise go unnoticed, AI encourages broader design exploration. We can expect such partnerships between human experts and LLMs, in which the AI evaluates and narrows down countless possibilities, to become increasingly common.


Rendering Multiple Geometries

Rendering Multiple Geometries

Introduction In this article, we will introduce methods to optimize and reduce rendering load when rendering multiple geometries.

Abstract

1. The most basic way to create an object for a shape in three.js is to create a single mesh from a geometry and a material.
2. Because the resulting number of draw calls made rendering too slow, our first optimization was to merge geometries by material in order to reduce the number of meshes.
3. Additionally, for geometries that can be generated from a reference geometry through transformations such as move, rotate, and scale, we used InstancedMesh instead of merging, improving memory usage and reducing the load during geometry creation.

1. Basic Mesh Creation

Let's assume we are rendering a single apartment. Many kinds of shapes make up an apartment, but here we will use a block as an example. When rendering a single block as shown below, there is no need for much consideration.

...
const [geometry, material] = [new THREE.BoxGeometry(20, 10, 3), new THREE.MeshBasicMaterial({ color: 0x808080, transparent: true, opacity: 0.2 })];
const cube = new THREE.Mesh(geometry, material);
scene.add(cube);
...

(Interactive demo: draw call count and mesh creation time.)

However, when rendering an entire apartment, a single unit often contains more than 100 geometries (windows, walls, and so on), which means that rendering 100 units can easily push the number of geometries past five digits. This is the case when one mesh is created per geometry. Below is the result of rendering 10,000 shapes. Each geometry was created by cloning and translating a base shape.

// The code below is executed 5,000 times. The number of meshes added to the scene is 10,000.
...
const geometry_1 = base_geometry_1.clone().translate(i * x_interval, j * y_interval, k * z_interval);
const geometry_2 = base_geometry_2.clone().translate(i * x_interval, j * y_interval, k * z_interval + 3);
const cube_1 = new THREE.Mesh(geometry_1, material_1);
const cube_2 = new THREE.Mesh(geometry_2, material_2);
scene.add(cube_1);
scene.add(cube_2);
...

(Interactive demo: draw call count and mesh creation time.)

2. Merge Geometries

Now let's look at one of the most commonly used rendering optimizations: reducing the number of meshes. In graphics there is a concept called a draw call. The CPU finds the shapes to be rendered in the scene and requests the GPU to render them, and the number of these requests is the draw call count. It can be understood as the number of requests to render different meshes with different materials. The CPU, which is not specialized in this kind of multitasking, may become a bottleneck when it has to handle many calls at once. Source: https://joong-sunny.github.io/graphics/graphics/#%EF%B8%8Fdrawcall

By manipulating the scene with the mouse, you can check the difference in draw calls between this case and the previous one. In the case above, for reasons such as the camera's maximum distance or shapes leaving the screen, the call count is not always 10,000, but in most situations a significant number of calls occur. Since two materials were used here, we merged the geometries per material, so the draw call count is fixed at a maximum of 2.

// The code below is executed once. The number of meshes added to the scene is 2.
...
const mesh_1 = new THREE.Mesh(BufferGeometryUtils.mergeBufferGeometries(all_geometries_1), material_1);
const mesh_2 = new THREE.Mesh(BufferGeometryUtils.mergeBufferGeometries(all_geometries_2), material_2);
scene.add(mesh_1);
scene.add(mesh_2);
...
(Interactive demo: draw call count and mesh creation time.)

You can directly confirm that the rendering load has improved.

3. InstancedMesh

Above, we reduced the load from the perspective of draw calls by merging geometries. These shapes were built only with clone and translate, so the base form is the same. The same situation occurs when rendering objects such as buildings or trees: when each floor does not need a different form, when parts are copied to form the walls of a floor, or when a single type of tree is reused with only its size or orientation changed. It also applies when multiple buildings are generated from the same floor plan. In such cases you can optimize further with InstancedMesh, which is efficient in memory and time because it reduces the number of geometry objects that have to be created.

In graphics there is a concept called instancing. Instancing renders similar geometries multiple times by sending the geometry data to the GPU once, together with the transformation of each instance, which means the geometry only needs to be created once. Game engines like Unity achieve the same optimization through a feature called GPU instancing.

Rendering multiple similar shapes (Unity GPU Instancing). Source: https://unity3d.college/2017/04/25/unity-gpu-instancing/

three.js has a corresponding feature called InstancedMesh. The draw call count is the same as in the merged-geometry case above, but there is a significant advantage in memory and rendering time because each geometry does not need to be created separately. The diagram below simplifies the meshes sent to the GPU in processes 1, 2, and 3. Although only translation is used in the example below, transformations such as rotate and scale can also be applied. In addition to avoiding separate meshes, separate geometries are avoided as well, which shows up as a significant difference in creation time.

// You need to pass the number of instances as the last argument of InstancedMesh.
const mesh_1 = new THREE.InstancedMesh(base_geometry_1, material_1, x_range * y_range * z_range);
const mesh_2 = new THREE.InstancedMesh(base_geometry_2, material_2, x_range * y_range * z_range);
let current_total_index = 0;
// Example pattern: each instance gets its own transformation matrix via setMatrixAt (translation only here).
const matrix = new THREE.Matrix4();
for (let i = 0; i < x_range; i++) {
  for (let j = 0; j < y_range; j++) {
    for (let k = 0; k < z_range; k++) {
      matrix.setPosition(i * x_interval, j * y_interval, k * z_interval);
      mesh_1.setMatrixAt(current_total_index, matrix);
      matrix.setPosition(i * x_interval, j * y_interval, k * z_interval + 3);
      mesh_2.setMatrixAt(current_total_index, matrix);
      current_total_index += 1;
    }
  }
}
scene.add(mesh_1);
scene.add(mesh_2);

(Interactive demo: draw call count and mesh creation time.)

Of course, transformations such as translate (move), rotate (rotation), and scale (size) cannot cover every case, so in many situations the merged-geometry method still has to be mixed in.

Conclusion

There are various ways to optimize three.js rendering, but in this article we focused on reducing the rendering burden that comes from the number of geometries. In a future post, I will share experiences of fixing load issues caused by memory leaks and other problems. Thank you.


Zone Subdivision With LLM - Expanded Self Feedback Cycle

Zone Subdivision With LLM - Expanded Self Feedback Cycle

Introduction In this post, we explore the use of Large Language Models (LLMs) in a feedback loop to enhance the quality of results through iterative cycles.

The goal is to improve the initial intuitive results provided by LLMs by engaging in a cycle of feedback and optimization that extends beyond the internal feedback mechanisms of LLMs.

Concept

LLMs can improve results through self-feedback, a widely utilized feature. However, relying on a single user request for complete results is challenging. By re-requesting and exchanging feedback on initial results, we aim to enhance the quality of responses. This process involves not only internal feedback within the LLM but also feedback in a larger cycle that includes algorithms, continuously improving results.

Self-feedback outside the LLM API (this work) / Self-feedback inside the LLM API (existing examples)

Significance of Using LLMs: the LLM acts as a bridge between vague user needs and specific algorithmic criteria, and it understands cycle results and reflects them in subsequent cycles to improve responses.

Premises

Two main components: the LLM is used for intuitive selection and bridges the gap between the user's relatively vague intentions and the optimizer's specific criteria; the heuristic optimizer (GA) is used for relatively specific optimization. By repeating cycle execution, we aim to create a structure that approaches the intended results on its own.

Details

Parameter Conversion: Convert intuitive selections into clear parameter inputs for the algorithm using the LLM and structured output (a sketch of such a schema follows this section).
- Adjust the number of additional subdivisions based on the zone grid and grid size: number_additional_subdivision_x: int, number_additional_subdivision_y: int
- Prioritize placement close to the boundary for each use based on the prompt: place_rest_close_to_boundary: bool, place_office_close_to_boundary: bool, place_lobby_close_to_boundary: bool, place_coworking_close_to_boundary: bool
- Ask for the desired percentage of mixture between different uses in adjacent patches: percentage_of_mixture: number
- Ask for the percentage each use should occupy: office: number, coworking: number, lobby: number, rest: number

Optimization Based on LLM Responses: Use the LLM responses as optimization criteria, turning the user's intent into specific criteria.

Incorporate Optimization Results into the Next Cycle: Insert the optimization results as references when asking the LLM in the next cycle, reinforcing the significance of running multiple [LLM answer - optimize results with the answer] cycles.

Cycle Structure: One cycle consists of two [LLM answer - optimize results with the answer] steps. In the first, the LLM is asked about subdivision criteria based on the prompt before optimizing zone usage. In the second, the LLM is asked for intuitive answers about the zone configuration based on the prompt. After the second cycle, we directly request improved responses by providing the actual optimization results along with the previous LLM responses.
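As an illustration of the parameter-conversion step above, here is a minimal sketch of a structured-output schema. The field names come from the list above; the use of pydantic (v2) and the way the schema is handed to the LLM are assumptions for illustration, not the actual implementation.

from pydantic import BaseModel

class SubdivisionCriteria(BaseModel):
    """Parameters the LLM must return for one cycle (field names from the list above)."""
    # Additional subdivisions applied to the zone grid.
    number_additional_subdivision_x: int
    number_additional_subdivision_y: int
    # Whether each use should be prioritized close to the boundary.
    place_rest_close_to_boundary: bool
    place_office_close_to_boundary: bool
    place_lobby_close_to_boundary: bool
    place_coworking_close_to_boundary: bool
    # Desired degree of mixture between uses in adjacent patches (percent).
    percentage_of_mixture: float
    # Target share of each use (percent).
    office: float
    coworking: float
    lobby: float
    rest: float

# The JSON schema derived from this model would be handed to the LLM's structured-output
# feature; the validated response then becomes the optimizer's criteria for this cycle.
print(SubdivisionCriteria.model_json_schema())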
Test Results

Case 1 Prompt: In a large space, the office spaces are gathered in the center for a common goal. I want to place the other spaces close to the boundary. (Result GIFs.)

Case 2 Prompt: The goal is to have teams of about 5 people, working in silos. Therefore, we want mixed spaces where no single use is concentrated. (Result GIFs.)

Case 3 Prompt: Prioritize placing the office close to the boundary to easily receive sunlight, and place other spaces inside. (Result GIFs.)

Conclusion

This work aims to improve final results by expanding the scope of self-feedback beyond the LLM itself to include LLM requests, optimization, and post-processing. As cycles repeat, the results approach the intended outcomes, reducing the potential incompleteness of the initial values the LLM relies on.


Landbook Diffusion Pipeline

Landbook Diffusion Pipeline

Introduction

Landbook is a service that supports all the steps of new construction development for small and medium-sized land investors. Landbook's AI architect service provides building owners with various architectural design proposals by considering different plot sizes, zoning areas, and building regulations for each region.

Landbook's AI architect service

This project develops a pipeline that renders the final result of the Landbook AI architect service using generative image models through the diffusers library.

By taking 3D modeling data as input and generating realistic images that closely resemble actual buildings, it allows building owners to visualize and review what their design proposals would look like when actually constructed. Unlike conventional 3D rendering, this aims to provide high-quality visualization that considers both the texture of actual buildings and their harmony with the surrounding environment by utilizing AI-based generative models. Pipeline Overview The comprehensive pipeline below ensures that the final output not only accurately represents the architectural design but also provides a realistic visualization that helps building owners better understand how the building will look in reality. The pipeline consists of the following steps: 2D Plans Generation: The process begins with generating 2D floor plans that serve as the foundation for the building design. 3D Building Generation: The 2D plans are transformed into a 3D building model with proper dimensions and structure. Three. js Plot: The 3D model is plotted in Three. js, allowing for visualization and manipulation. Camera Adjustment: The viewing angle and camera position are carefully adjusted to capture the building from the most appropriate perspective. Scale Figures: Human figures, trees, and vehicles are added to provide scale reference and context to the scene. Masking: Different parts of the building and environment are masked with distinct colors to define materials and surfaces. Canny Edge Detection: Edge detection is applied to create clear building outlines and details. Highlighting: Important architectural features and edges are emphasized through highlighting and hatching. Base Image Generation: A base image with proper shading and basic textures is created. Inpainting & Refining: Multiple iterations of inpainting and refinement are performed to add realistic textures and details. Pipeline Diagram Camera Position Estimation The camera position estimation is a crucial step in capturing the building from the most effective viewpoint. The algorithm determines the camera position appropriately by considering the building's dimensions, plot layout, and road positions. Road-Based Positioning Identifies the widest road adjacent to the building plot Uses the road's centroid as a reference point for camera placement Ensures the building is viewed from a natural street-level perspective Vector Calculation Creates a horizontal vector aligned with the widest road (X vector) Creates a vertical vector by rotating the horizontal vector 90 degrees (Y vector) These vectors form the basis for determining the camera's viewing direction Height Determination and Distance Calculation Calculates optimal camera height using two criteria Selects the maximum value between these criteria to ensure proper building coverage Uses trigonometry to compute the ideal distance between camera and building as follows. \[ \tan(\theta) = \frac{h}{d}, \quad d = \frac{h}{\tan(\theta)} \] where \(d\) is the distance between camera and widestRoadCentroid, \(h\) is the height of the camera, and \(\theta = \frac{\text{fov}}{2} \times \frac{\pi}{180}\) Camera Position Estimation Diagram const estimateCameraPosition = ( data: BuildingStateInfo, buildingHeightEstimated: number, fov: number, ) => { const parcelPolygon = data. plotOutline // Obtain the widest road object. // The widest road object is computed by widthRaw + edgeLength let widestWidth = -Infinity; let widestRoad = undefined; data. roadWidths. 
forEach((road) => { if (widestWidth Scale Figures Scale figures serve as essential contextual elements for the diffusion model to understand and generate more realistic architectural visualizations. By incorporating human figures, trees, and vehicles into the scene, we provide the model with crucial reference points that help it comprehend the spatial relationships and scale of the architecture it needs to generate. Scale Figures The presence of these contextual elements also guides the model in generating appropriate lighting, shadows, and atmospheric effects. When the model sees a human figure or a tree in the scene, it can better interpret the scale of lighting effects and environmental interactions that should be present in the final rendering. This helps create more convincing and naturally integrated architectural visualizations. In our pipeline, these scale elements are placed before the diffusion process begins. The model uses these references to better understand the intended size and proportions of the building, which significantly improves the quality and accuracy of the generated images. Human figures are particularly important as they provide the diffusion model with a scale reference that helps maintain consistent and realistic proportions throughout the generation process. Landbook AI Architect result w/ and w/o scale figures Material Masking ShaderMaterial provided by three. js is used to mask the materials of the building and the environment. ShaderMaterial is a material rendered with custom shaders. A shader is a small program written in GLSL that runs on the GPU. Since ShaderMaterial allows users to write custom shaders, we can create specialized masking materials by defining specific colors for different architectural elements. These masking materials help segment the 3D model into distinct parts that can be processed separately by the diffusion model. Material Masking const createMaskMaterial = (color: number) => { return new THREE. ShaderMaterial({ uniforms: { color: { value: new THREE. Color(color) } }, vertexShader: ` void main() { gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1. 0); } `, fragmentShader: ` uniform vec3 color; void main() { gl_FragColor = vec4(color, 1. 0); } ` }); }; export const glassMaskMaterial = createMaskMaterial(0xff0000); // red export const glassPanesMaskMaterial = createMaskMaterial(0x00ffff); // cyan export const columnsMaskMaterial = createMaskMaterial(0xff00ff); // magenta export const wallMaskMaterial = createMaskMaterial(0x0000ff); // blue export const surroundingBuildingsMaskMaterial = createMaskMaterial(0xffff00); // yellow export const surroundingParcelsMaskMaterial = createMaskMaterial(0xff8e00); // orange export const roadMaskMaterial = createMaskMaterial(0x000000); // black export const siteMaskMaterial = createMaskMaterial(0x00ff00); // green export const railMaskMaterial = createMaskMaterial(0xbc00bc); // purple export const carMaskMaterial = createMaskMaterial(0xbcbcbc); // gray export const treeMaskMaterial = createMaskMaterial(0xc38e4d); // brown export const personMaskMaterial = createMaskMaterial(0x00a800); // darkgreen export const pathMaskMaterial = createMaskMaterial(0xc0e8f6); // skyblue export const parkingLineMaskMaterial = createMaskMaterial(0x000080) // darkblue EndpointHandler 🤗 The diffusion process in our pipeline utilizes multiple specialized models from the HuggingFace Diffusers library to generate photorealistic architectural visualizations. 
The process consists of three main stages: initial generation, targeted inpainting, and final refinement. The pipeline begins with StableDiffusionXLControlNetPipeline using a ControlNet model trained on canny edge images.

Canny edge detection, highlighting the main building, hatching some parts

This stage takes the edge information from our 3D model and generates a base image. The ControlNet helps ensure that the generated base image follows the precise geometric outlines of the building design, with the help of the prompt:

self.prompt_positive_base = ", ".join(
    [
        "",
        "[Bold Boundary of given canny Image is A Main Building outline]",
        "[Rich Street Trees]",
        "[Pedestrians]",
        "[Pedestrian path with hatch pattern paving stone]",
        "[Driving Cars on the asphalt roads]",
        "At noon",
        "[No Clouds at the sky]",
        "First floor parking lot",
        "glass with simple mullions",
        "BLENDED DIVERSE ARCHITECTURAL MATERIALS",
        "Korean city context",
        "REALISTIC MATERIAL TEXTURE",
        "PROPER PERSPECTIVE VIEW",
        "PROPER ARCHITECTURAL SCALE",
        "8k uhd",
        "masterpiece",
        "[Columns placed at the corner of the main building]",
        "best quality",
        "ultra detailed",
        "professional lighting",
        "Raw photo",
        "Fujifilm XT3",
        "high quality",
    ]
)

After the initial generation, the pipeline performs a series of targeted inpainting operations using StableDiffusionXLInpaintPipeline. The inpainting process follows a specific order to handle the different architectural elements, and each step uses crafted prompts and masks so that appropriate material textures and architectural details are generated for each element. After each inpainting step, the result is merged with the base image to create a new base image. The order is roughly: road surfaces with asphalt texturing; surrounding parcels and pedestrian paths; background elements including the sky; surrounding buildings with appropriate architectural details; ( ... )

Masked images

The last stage uses StableDiffusionXLImg2ImgPipeline to refine the overall image, enhancing the coherence and realism of the rendering. This refinement focuses on improving overall image quality through better resolution and detail enhancement. It adjusts lighting and shadows to create more natural and realistic effects, ensures consistent material appearances across the different surfaces of the building, and fine-tunes architectural details to maintain design accuracy. Together, these refinements produce a final image that is both architecturally accurate and visually compelling.

Results

After applying the multi-stage diffusion pipeline described above, we get the following results, which demonstrate the effectiveness of our approach in generating high-quality architectural renderings with consistent materials, lighting, and architectural details.
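For orientation, here is a minimal sketch of the first stage (base-image generation with a canny ControlNet) using the HuggingFace Diffusers API. The checkpoints, file name, prompt, and parameter values are illustrative assumptions; the actual endpoint handler, inpainting stages, and refinement stage are not shown.

import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Assumed checkpoints for illustration; the production pipeline may use different ones.
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

canny_image = load_image("canny_edges_from_threejs.png")  # hypothetical path

base_image = pipe(
    prompt="realistic architectural rendering, Korean city context, 8k uhd",
    image=canny_image,                     # canny edges constrain the building geometry
    controlnet_conditioning_scale=0.7,     # how strongly the edges guide generation
    num_inference_steps=30,
).images[0]
base_image.save("base.png")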
Future Works

While the current pipeline successfully generates realistic architectural rendering images, there are several areas for potential improvement and future development:

Material Diversity Enhancement: The system could be improved to handle more diverse surrounding building facade textures and materials, along with better material interaction and weathering effects, to create more realistic environmental contexts.

Sky Condition Variations: Future development could include support for different times of day, various weather effects and cloud patterns, and dynamic atmospheric conditions to provide more options for visualization scenarios.

Road Detail Improvements: The pipeline could be enhanced to generate more detailed road surfaces, including various pavement types, road markings, surface wear patterns, and better integration with surrounding elements.


Floor Plan generation with Voronoi Diagram

Floor Plan generation with Voronoi Diagram

Introduction

This project reviews and implements the paper Free-form Floor Plan Design using Differentiable Voronoi Diagram.

Deep learning, and gradient-based optimization in general, uses only tensors to compute gradients, but I find this unintuitive for geometry. Therefore, I aim to integrate tensor operations and geometric operations using PyTorch and Shapely. The biggest difference between the paper and this project is whether autograd is used: the paper uses the differentiable Voronoi diagram to chain the gradient flow, whereas I use numerical differentiation to approximate the gradients directly.

Floor plan generation with Voronoi diagram

So, what is numerical differentiation?

Numerical Differentiation

Numerical differentiation approximates a derivative using finite perturbation differences. Unlike the automatic differentiation provided by the differentiable Voronoi diagram in the original paper, this approach calculates derivatives by evaluating the function at several nearby points. There are three basic methods for numerical differentiation; here, the central difference method is used to compute the gradient.

Basic methods for the numerical differentiation

Central difference method: \[ \begin{align*} f'(x) &= \lim_{h \, \rightarrow \, 0} \, \frac{1}{2} \cdot \left( \frac{f(x + h) - f(x)}{h} - \frac{f(x - h) - f(x)}{h} \right) \\ &= \lim_{h \, \rightarrow \, 0} \, \frac{1}{2} \cdot \frac{f(x + h) - f(x - h)}{h} \\ &= \lim_{h \, \rightarrow \, 0} \, \frac{f(x + h) - f(x - h)}{2h} \end{align*} \]

The \(h\) (or \(dx\)) in the expression is a perturbation value that determines the accuracy of the approximation. As \(h\) approaches zero, the numerical approximation gets closer to the true derivative. In practice, however, we cannot use an infinitely small value due to computational limitations and floating-point precision. Choosing an appropriate step size is crucial: values that are too large lead to poor approximations, while values that are too small cause numerical instability due to rounding errors. A stable perturbation value typically ranges from \(h = 10^{-4}\) to \(h = 10^{-6}\). In this implementation, I used \(h = 10^{-6}\).

Expression of Loss functions

In the paper, the key loss functions for optimizing floor plans consist of four parts. The contents below are excerpted from the paper:

Wall loss: As the unconstrained Voronoi diagram typically produces undesirable fluctuations in the wall orientations, we design a tailored loss function to regularize the wall complexity. Inspired by Cubic Stylization, we regularize the \(\mathcal{L}_1\) norm of the wall length. The \(L_1\) norm is defined as \(|v_x| + |v_y|\) (norm of \(x\) + norm of \(y\)), so \(\mathcal{L}_{\text{wall}}\) is minimal when the vector \(\mathbb{v}_j - \mathbb{v}_i\) is vertical or horizontal. \[ \,\\ \mathcal{L}_{\text{wall}} = w_{\text{wall}} \sum_{(v_i, v_j) \, \in \, \mathcal{E}} ||\, \mathbb{v}_i - \mathbb{v}_j \,||_{L1} \,\\ \] where \(\mathcal{E}\) denotes the set of edges of the Voronoi cells between two adjacent rooms, and \(\mathbb{v}_i\) and \(\mathbb{v}_j\) denote the Voronoi vertices belonging to the edge.

Area loss: The area of each room is specified by the user. We minimize the quadratic difference between the current room areas and the user-specified targets. Here, \(\bar{A}_r\) denotes the target area for room \(r\). \[ \,\\ \mathcal{L}_{\text{area}} = w_{\text{area}} \sum_{r=1}^{\#Room} ||\, A_r(\mathcal{V}) - \bar{A}_r \,||^2 \,\\ \]

Lloyd loss: To regulate the site density, we design a loss function inspired by Lloyd's algorithm.
Here, \(\mathbb{c}_i \) denotes the centroid of the \(i\)-th Voronoi cell. This is useful for attracting these exterior sites inside \(\Omega\). \[ \,\\ \mathcal{L}_{\text{Lloyd}} = w_{\text{Lloyd}} \sum_{i=1}^N ||\, \mathbb{s}_i - \mathbb{c}_i \,||^2 \,\\ \] Topology loss: We design the topology loss such that each room is a single connected region, and the specified connections between rooms are achieved. We move the site to satisfy the desired topology by setting the goal position \(\mathbb{t}_i\) for each site \(\mathbb{s}_i\) as \[ \,\\ \mathcal{L}_{\text{topo}} = w_{\text{topo}} \sum_{i=1}^N ||\, \mathbb{s}_i - \mathbb{t}_i \,||^2 \,\\ \] The goal position \(\mathbb{t}_i\) can be automatically computed as the nearest site to the site from the same group. For each room, we first group the sites belonging to that room into groups of adjacent sites. If multiple groups are present, that is, a room is split into separated regions, we set the target position of the site \(\mathbb{t}_i\) as the nearest site to that group. Implementation of loss functions As I mentioned in the Introduction, to implement the loss functions above for the forward propagation I used Shapely and Pytorch as below. Total loss is defined as a weighted sum of the above losses, and then using it, the Voronoi diagram generates a floor plan. \[ \,\\ \begin{align*} \mathcal{S}^{*} &= \arg \min_{\mathcal{S}} \mathcal{L}(\mathcal{S}, \mathcal{V}(\mathcal{S})) \\ \mathcal{L} &= \mathcal{L}_{\text{wall}} + \mathcal{L}_{\text{area}} + \mathcal{L}_{\text{fix}} + \mathcal{L}_{\text{topo}} + \mathcal{L}_{\text{Lloyd}} \end{align*} \,\\ \] class FloorPlanLoss(torch. autograd. Function): @staticmethod def compute_wall_loss(rooms_group: List[List[geometry. Polygon]], w_wall: float = 1. 0): loss_wall = 0. 0 for room_group in rooms_group: room_union = ops. unary_union(room_group) if isinstance(room_union, geometry. MultiPolygon): room_union = list(room_union. geoms) else: room_union = [room_union] for room in room_union: t1 = torch. tensor(room. exterior. coords[:-1]) t2 = torch. roll(t1, shifts=-1, dims=0) loss_wall += torch. abs(t1 - t2). sum(). item() for interior in room. interiors: t1 = torch. tensor(interior. coords[:-1]) t2 = torch. roll(t1, shifts=-1, dims=0) loss_wall += torch. abs(t1 - t2). sum(). item() loss_wall = torch. tensor(loss_wall) loss_wall *= w_wall return loss_wall @staticmethod def compute_area_loss( cells: List[geometry. Polygon], target_areas: List[float], room_indices: List[int], w_area: float = 1. 0, ): current_areas = [0. 0] * len(target_areas) for cell, room_index in zip(cells, room_indices): current_areas[room_index] += cell. area current_areas = torch. tensor(current_areas) target_areas = torch. tensor(target_areas) area_difference = torch. abs(current_areas - target_areas) loss_area = torch. sum(area_difference) loss_area **= 2 loss_area *= w_area return loss_area @staticmethod def compute_lloyd_loss(cells: List[geometry. Polygon], sites: torch. Tensor, w_lloyd: float = 1. 0): valids = [(site. tolist(), cell) for site, cell in zip(sites, cells) if not cell. is_empty] valid_centroids = torch. tensor([cell. centroid. coords[0] for _, cell in valids]) valid_sites = torch. tensor([site for site, _ in valids]) loss_lloyd = torch. norm(valid_centroids - valid_sites, dim=1). sum() loss_lloyd **= 2 loss_lloyd *= w_lloyd return loss_lloyd @staticmethod def compute_topology_loss(rooms_group: List[List[geometry. Polygon]], w_topo: float = 1. 0): loss_topo = 0. 0 for room_group in rooms_group: room_union = ops. 
unary_union(room_group) if isinstance(room_union, geometry. MultiPolygon): largest_room, *_ = sorted(room_union. geoms, key=lambda r: r. area, reverse=True) loss_topo += len(room_union. geoms) for room in room_group: if not room. intersects(largest_room) and not room. is_empty: loss_topo += largest_room. centroid. distance(room) loss_topo = torch. tensor(loss_topo) loss_topo **= 2 loss_topo *= w_topo return loss_topo ( . . . ) @staticmethod def forward( ctx: FunctionCtx, sites: torch. Tensor, boundary_polygon: geometry. Polygon, target_areas: List[float], room_indices: List[int], w_wall: float, w_area: float, w_lloyd: float, w_topo: float, w_bb: float, w_cell: float, save: bool = True, ) -> torch. Tensor: cells = [] walls = [] sites_multipoint = geometry. MultiPoint([tuple(point) for point in sites. detach(). numpy()]) raw_cells = list(shapely. voronoi_polygons(sites_multipoint, extend_to=boundary_polygon). geoms) for cell in raw_cells: intersected_cell = cell. intersection(boundary_polygon) intersected_cell_iter = [intersected_cell] if isinstance(intersected_cell, geometry. MultiPolygon): intersected_cell_iter = list(intersected_cell. geoms) for intersected_cell in intersected_cell_iter: exterior_coords = torch. tensor(intersected_cell. exterior. coords[:-1]) exterior_coords_shifted = torch. roll(exterior_coords, shifts=-1, dims=0) walls. extend((exterior_coords - exterior_coords_shifted). tolist()) cells. append(intersected_cell) cells_sorted = [] raw_cells_sorted = [] for site_point in sites_multipoint. geoms: for ci, (cell, raw_cell) in enumerate(zip(cells, raw_cells)): if raw_cell. contains(site_point): cells_sorted. append(cell) cells. pop(ci) raw_cells_sorted. append(raw_cell) raw_cells. pop(ci) break rooms_group = [[] for _ in torch. tensor(room_indices). unique()] for cell, room_index in zip(cells_sorted, room_indices): rooms_group[room_index]. append(cell) loss_wall = torch. tensor(0. 0) if w_wall > 0: loss_wall = FloorPlanLoss. compute_wall_loss(rooms_group, w_wall=w_wall) loss_area = torch. tensor(0. 0) if w_area > 0: loss_area = FloorPlanLoss. compute_area_loss(cells_sorted, target_areas, room_indices, w_area=w_area) loss_lloyd = torch. tensor(0. 0) if w_lloyd > 0: loss_lloyd = FloorPlanLoss. compute_lloyd_loss(cells_sorted, sites, w_lloyd=w_lloyd) loss_topo = torch. tensor(0. 0) if w_topo > 0: loss_topo = FloorPlanLoss. compute_topology_loss(rooms_group, w_topo=w_topo) loss_bb = torch. tensor(0. 0) if w_bb > 0: loss_bb = FloorPlanLoss. compute_bb_loss(rooms_group, w_bb=w_bb) loss_cell_area = torch. tensor(0. 0) if w_cell > 0: loss_cell_area = FloorPlanLoss. compute_cell_area_loss(cells_sorted, w_cell=w_cell) if save: ctx. save_for_backward(sites) ctx. room_indices = room_indices ctx. target_areas = target_areas ctx. boundary_polygon = boundary_polygon ctx. w_wall = w_wall ctx. w_area = w_area ctx. w_lloyd = w_lloyd ctx. w_topo = w_topo ctx. w_bb = w_bb ctx. w_cell = w_cell loss = loss_wall + loss_area + loss_lloyd + loss_topo + loss_bb + loss_cell_area return loss, [loss_wall, loss_area, loss_lloyd, loss_topo, loss_bb, loss_cell_area] Since I tried to intuitively convert the loss functions to Python codes with Shapely, there are some differences compared to the original. Backward with numerical differentiation Using numerical differentiation is not efficient in terms of computational performance. This is because it requires multiple function evaluations at nearby points to approximate derivatives. 
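To make this cost concrete, here is a minimal, self-contained sketch of the central-difference scheme described above, applied to a tensor of site coordinates. The toy_loss function is a stand-in for the actual FloorPlanLoss, and the double-precision tensor is my own choice to keep the \(10^{-6}\) step numerically stable; none of this is taken verbatim from the project code.

import torch

def toy_loss(sites: torch.Tensor) -> torch.Tensor:
    # Stand-in for FloorPlanLoss: any scalar function of the sites works here.
    return (sites ** 2).sum()

def numerical_grad(loss_fn, sites: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Central difference: two full loss evaluations per scalar parameter,
    # so the cost grows linearly with the number of sites.
    grads = torch.zeros_like(sites)
    for i in range(sites.size(0)):       # each site
        for j in range(sites.size(1)):   # each coordinate (x, y)
            pos, neg = sites.clone(), sites.clone()
            pos[i, j] += eps
            neg[i, j] -= eps
            grads[i, j] = (loss_fn(pos) - loss_fn(neg)) / (2 * eps)
    return grads

sites = torch.rand(10, 2, dtype=torch.float64)  # 10 Voronoi sites in 2D
print(numerical_grad(toy_loss, sites))

With \(N\) sites in 2D, one gradient therefore requires \(4N\) full loss evaluations, which is what makes the backward pass described next expensive.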
As you can see in the backward method, computational performance is influenced by the number of given sites. Therefore, I used Python's built-in multiprocessing module to improve the performance of backward propagation. @staticmethod def _backward_one(args): ( sites, i, j, epsilon, boundary_polygon, target_areas, room_indices, w_wall, w_area, w_lloyd, w_topo, w_bb, w_cell, ) = args perturbed_sites_pos = sites. clone() perturbed_sites_neg = sites. clone() perturbed_sites_pos[i, j] += epsilon perturbed_sites_neg[i, j] -= epsilon loss_pos, _ = FloorPlanLoss. forward( None, perturbed_sites_pos, boundary_polygon, target_areas, room_indices, w_wall, w_area, w_lloyd, w_topo, w_bb, w_cell, save=False, ) loss_neg, _ = FloorPlanLoss. forward( None, perturbed_sites_neg, boundary_polygon, target_areas, room_indices, w_wall, w_area, w_lloyd, w_topo, w_bb, w_cell, save=False, ) return i, j, (loss_pos - loss_neg) / (2 * epsilon) @runtime_calculator @staticmethod def backward(ctx: FunctionCtx, _: torch. Tensor, __): sites = ctx. saved_tensors[0] room_indices = ctx. room_indices target_areas = ctx. target_areas boundary_polygon = ctx. boundary_polygon w_wall = ctx. w_wall w_area = ctx. w_area w_lloyd = ctx. w_lloyd w_topo = ctx. w_topo w_bb = ctx. w_bb w_cell = ctx. w_cell epsilon = 1e-6 grads = torch. zeros_like(sites) multiprocessing_args = [ ( sites, i, j, epsilon, boundary_polygon, target_areas, room_indices, w_wall, w_area, w_lloyd, w_topo, w_bb, w_cell, ) for i in range(sites. size(0)) for j in range(sites. size(1)) ] with multiprocessing. Pool(processes=multiprocessing. cpu_count()) as pool: results = pool. map(FloorPlanLoss. _backward_one, multiprocessing_args) for i, j, grad in results: grads[i, j] = grad return grads, None, None, None, None, None, None, None, None, None, None Initializing parameters In optimization problems, the initial parameters significantly affect the final results. Firstly, I initialized the Voronoi diagram's sites such that the sites were generated at the center of a given floor plan: Random Sites Generation: Generate initial random sites using uniform distribution. Moving to Center of Boundary: Shift all sites so they are centered within the floor plan boundary. Outside Sites Adjustment: Adjust any sites that fall outside the boundary by moving them inward. Voronoi Diagram: Generate Voronoi diagram using sites. Process of parameters initialization Secondly, I used the KMeans clustering algorithm to assign cell indices per each site. Distance-based KMeans algorithm groups sites based on their spatial proximity, which helps ensure that rooms are formed from adjacent cells. By pre-clustering the sites, I created initial room assignments that are already spatially coherent, reducing the possibility of disconnected room regions during optimization. Using this approach, the optimizer converges more stably. Let me give an example: Floor plan generation on 300 iterations From the left, optimization without KMeans · optimization with KMeans As you can see in the figure above, KMeans makes the loss flow more smoothly and converge faster. Without KMeans, the optimization process shows erratic behavior with disconnected rooms. 
In contrast, when using KMeans for the initial room assignments, the optimization maintains spatial coherence throughout the process, leading to:
Faster convergence to the target room areas
More stable wall alignments
Reduced possibility of rooms splitting into disconnected regions
This improvement in optimization stability is particularly important for complex floor plans with multiple rooms and specific area requirements.
Experiments
Finally, I'll conclude this article by attaching the experimental results, optimized over 800 iterations. The boundaries used in the experiments come from the original paper and its repository. Please refer to this project repository for the entire code.
Future works
Set entrance: In the paper, the entrance of the plan is set using the \(\mathcal{L}_{\text{fix}}\) loss function.
Graph-based constraint: In the paper, the rooms' adjacencies are set and ensured using a graph-based constraint.
Improve computational performance: Optimize the code to run faster (porting to a faster language, or implementing a differentiable Voronoi diagram).
Handle deadspaces: Define a loss function for deadspace, \(\mathcal{L}_{\text{deadspace}}\), to exclude infeasible plans.
Following boundary axis: Align walls to follow the axis of a given boundary instead of the global X and Y axes (replacing \(\mathcal{L}_{\text{wall}}\)).


The Potential for Architectural Space Variation Through the WFC Algorithm

The Potential for Architectural Space Variation Through the WFC Algorithm

The Potential for Architectural Space Variation Through the WFC Algorithm In contemporary architecture, there is a growing trend to creatively reinterpret and vary existing architectural vocabularies and styles, rather than simply “copying and pasting” them.

One such approach involves using the ‘Wave Function Collapse (WFC)’ algorithm. This study introduces the WFC algorithm into the field of architecture to explore how to automatically generate variations of spatial patterns. Using Dutch architect Aldo van Eyck’s “Sonsbeek Sculpture Pavilion” as a prime example, we trained a machine to learn the building’s spatial vocabulary and generated various alternative floor plans. We then discuss the significance and limitations of this algorithmic approach as revealed through the process. Understanding the Principles of the WFC Algorithm The WFC (Wave Function Collapse) algorithm is a procedural generation technique often used in game development and computer graphics. Its key feature is that it takes a single example pattern as input and generates outputs that resemble that pattern. In architectural terms, if we provide a building’s floor plan pattern or spatial rules as an “example,” the computer can learn it and then automatically produce new floor plan options that have a similar “look and feel. ” Superposition & Collapse In the initial stage, every cell is in an uncertain, “superposed” state in which it could be anything. Then, the algorithm selects the cell with the lowest uncertainty (fewest possible choices) and assigns it a specific tile (or spatial module) — a process known as “collapse. ” Constraint Propagation Once a cell’s state is decided, it affects its neighboring cells by imposing rules like, “Because I am in this state, you cannot be in that state. ” This propagation repeats until the overall arrangement is free of contradictions. Backtracking Occasionally, the algorithm encounters a dead end, where no module can fit into a certain cell. At that point, it backtracks a few steps to try a different option. Viewed through an architectural lens, WFC is like assembling “puzzle pieces” one by one and backtracking whenever there is a conflict, until the whole arrangement is complete. Aldo van Eyck Amsterdam Orphanage / Aldo van Eyck Aldo van Eyck (1918–1999) was a major figure in mid-20th-century Dutch architecture. He emphasized human-centered design and lively social interaction within space. Often associated with the ‘Structuralism’ movement that emerged after CIAM, he viewed architecture as a stage for human life, striving to incorporate everyday psychological and social dimensions into design. For example, in designing an orphanage, he placed small courtyards, corridors, and open spaces throughout the building instead of lining up housing units in a strict row, effectively creating a “miniature city. ” He sought to connect the vast outside city with the building’s interior in a seamless way, bringing coziness of a home into the city, and continuity of the city into each housing unit. Beyond clean modernist design, van Eyck aimed to capture the rhythm and flow of spaces where people naturally come together. The Sonsbeek Sculpture Pavilion: Spatial Elements and Patterns Sonsbeek Sculpture Pavilion / Aldo van Eyck One of his most notable works is the Sonsbeek Sculpture Pavilion, built in 1966 as a temporary exhibition space for a sculpture show in a park in Arnhem. Externally, it appears enclosed, but upon entering, one discovers a complex, diverse exhibition layout. In short, it’s like “a small city placed in the middle of a park. ” The pavilion is composed of multiple parallel walls, interwoven with semi-circular pocket spaces and corridors in between. 
Its floor plan is generated by a process of repetition and variation, creating a distinctive experience of privacy and invitation within the natural setting. In this study, we extracted the pavilion’s walls, semi-circular modules, and corridors as puzzle pieces, and defined how the modules (walls + curved segments + corridors) could connect as a set of rules (constraints). When these rules are given to the WFC algorithm, it can generate new floor plan variations that retain the essence of the original pavilion. Analysis of the Sonsbeek Sculpture Pavilion Applying the WFC Algorithm: Generating Floor Plan Variations Information stored in each module Connection example 50 sample modules We applied WFC in earnest by first simplifying the Sonsbeek Pavilion into a grid, and defining 50 or so possible modules (walls, pockets, corridors, etc. ) that fit into that grid. Rules such as “Only corridors can be placed adjacent to walls” or “Pocket modules can only be placed at certain intervals” function as constraints in WFC. 2-module 4-module 6-module 12-module The algorithm begins by assigning all possible modules to each cell. Then, it selects the easiest cell to finalize (the one with the lowest entropy) and fixes that cell’s module. Based on this decision, the algorithm updates the possibilities of adjacent cells (constraint propagation), then selects another cell to finalize, and so on. Repeating this process, the floor plan grows in a manner reminiscent of how van Eyck combined parallel walls and curved elements. Process of WFC algorithm Aldo van Eyck mutation with WFC As a result, we obtained numerous variations that look quite similar to the original but still differ in various ways. Some have additional pocket spaces, making them more maze-like; others have fewer corridors, resulting in a more streamlined layout. However, even with multiple backtracking attempts, some outputs inevitably failed by producing disconnected circulation or redundant walls. This underscores the importance of refining “rule design,” which in practical architecture might mean more precise constraints like “Do not place a kitchen directly next to a bathroom. ” Overall process Significance and Limitations of the Algorithmic Approach As demonstrated, the WFC algorithm is well suited to generating diverse floor plan variations from a single architectural pattern (e. g. , the Sonsbeek Pavilion). From a designer’s perspective, this is advantageous because one can get multiple variations quickly without manually drawing each one. The algorithm also preserves much of the aesthetic and spatial logic of the input example, opening up possibilities for reusing “an existing architect’s design language” in new contexts. However, architecture involves a complex weave of qualitative factors and functions, so relying solely on numeric rules and modules has inherent limits. For instance, “the psychological sense of distance between people in this space” or “the building’s historical and urban context” cannot be fully captured by simple module-placement rules. Ultimately, a human designer must review the algorithm’s output and judge: “This is good, but that layout is too winding for an actual exhibition route,” etc. Moreover, practical concerns such as structural safety, daylight, and soundproofing are not automatically resolved by the algorithm. Thus, WFC is most useful as a conceptual sketch tool or form-finding experiment rather than a comprehensive design solution. 
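To make the collapse-and-propagate loop described above concrete, here is a minimal sketch. The three tiles and the adjacency rules are hypothetical placeholders rather than the actual 50 Sonsbeek modules, and full backtracking is omitted: a contradiction simply raises an error.

import random
from collections import deque

# Hypothetical tile set and adjacency rules (direction-agnostic for brevity).
TILES = {"wall", "corridor", "pocket"}
RULES = {
    "wall": {"wall", "corridor"},
    "corridor": {"wall", "corridor", "pocket"},
    "pocket": {"corridor"},
}

def wfc(width, height, seed=0):
    random.seed(seed)
    # Superposition: every cell starts with all tiles possible.
    grid = [[set(TILES) for _ in range(width)] for _ in range(height)]

    def neighbors(y, x):
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                yield ny, nx

    def propagate(y, x):
        # Constraint propagation: shrink neighboring cells until nothing changes.
        queue = deque([(y, x)])
        while queue:
            cy, cx = queue.popleft()
            allowed = set().union(*(RULES[t] for t in grid[cy][cx]))
            for ny, nx in neighbors(cy, cx):
                reduced = grid[ny][nx] & allowed
                if not reduced:
                    raise RuntimeError("contradiction: backtracking would be needed here")
                if reduced != grid[ny][nx]:
                    grid[ny][nx] = reduced
                    queue.append((ny, nx))

    while True:
        # Collapse: pick the undecided cell with the fewest options (lowest entropy).
        open_cells = [(len(grid[y][x]), y, x)
                      for y in range(height) for x in range(width) if len(grid[y][x]) > 1]
        if not open_cells:
            break
        _, y, x = min(open_cells)
        grid[y][x] = {random.choice(sorted(grid[y][x]))}
        propagate(y, x)

    return [[next(iter(cell)) for cell in row] for row in grid]

print(wfc(8, 4))

In the study itself, the rules encode which of the roughly 50 modules may sit next to which, and a failed propagation triggers backtracking rather than an error.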
Conclusion It is quite intriguing that the WFC algorithm can, to some extent, mimic the human-centered repetition and variation design approach employed by Aldo van Eyck. In other words, starting from a single architectural “grammar” (or vocabulary), it is possible to automatically generate multiple new floor plan scenarios. For instance, if one trains the algorithm using a “favorite housing project by Architect A,” the algorithm could then produce new apartment layouts that vary the original style while retaining its core qualities. Designers, in turn, might discover unexpected configurations that they themselves would not have imagined. Merging WFC with van Eyck’s design approach demonstrates how case-based generative algorithms could be used in future architectural design processes. Potential applications include feeding a prominent architect’s portfolio into the system to generate new ideas for upcoming projects, or training on historical precedents to propose new designs. Beyond individual architects, we might also teach the algorithm the universal patterns of specific building types, thereby accelerating the accumulation and reinvention of architectural knowledge. Example Outputs References The Wavefunction Collapse Algorithm explained very clearly | Robert Heaton Infinite procedurally generated city with the Wave Function Collapse algorithm | Marian’s Blog Aldo van Eyck pavilion – Kröller-Müller Museum Labyrinth and Life - Luis Fernández-Galiano | Arquitectura Viva Aldo van Eyck > Sculpture Pavilion, Sonsbeek Exhibition | HIC Wave Function Collapse & Plan Adjacencies — Sam Aston . .


Polygon Segmentation

Polygon Segmentation

Objective In the early stages of architectural design, there is a notion of an axis that determines the direction in which a building will be placed. This axis plays an important role in determining the optimal layout, considering the functionality, aesthetics, and environmental conditions of the building. The goal of this project is to develop a segmenter that can determine how many axes a given 2D polygon should be segmented into and how to make those segmentations.

Segmented lands by a human architect
The figures above are imaginary dividing lines arbitrarily set by the architect. The apartments in the figures are placed based on the segmented polygons. Human architects intuitively know how many axes a given 2D polygon should be segmented into and how to create these segmentations, but explaining this intuition to a computer is difficult. To achieve this, I will use a combination of deep learning and graph theory. In the graph, each point of the polygon will be a node and the connections between points will be edges. Based on this concept, I will implement a GNN-based model, which will learn how to optimally segment given polygons.
A simple understanding of graphs and GNNs
Before generating data, let's understand graphs and Graph Neural Networks. Basically, a graph is defined as \( G = (V, E) \). In this expression, \( V \) is the set of vertices (nodes) and \( E \) is the set of edges. Graphs are mainly expressed as an adjacency matrix. When the number of points is \( n \), the size of the adjacency matrix \( A \) is \( n \times n \). When dealing with a graph in machine learning, it is also expressed as a feature matrix depicting the characteristics of the points. When the number of features is \( f \), the dimension of the feature matrix \( X \) is \( n \times f \).
Understanding of the graph expression
In this figure, \( n = 4 \), \( f = 3 \), \( A = n \times n = 4 \times 4 \), and \( X = n \times f = 4 \times 3 \).
Graphs are used in various fields to represent data, and they are useful when teaching geometry to deep learning models for the following reasons:
Representing Complex Structures Simply: Graphs effectively represent complex structures and relationships, making them suitable for modeling various forms of geometric data.
Flexibility: Graphs can accommodate varying numbers of nodes and edges, making them advantageous when dealing with objects of diverse shapes and sizes.
A Graph Neural Network (GNN) is a type of neural network designed to operate on graph structures. Unlike traditional neural networks, which work on fixed-size inputs like vectors or matrices, GNNs can handle graph-structured data with variable size and connectivity. This makes GNNs particularly suitable for tasks where relationships are important, such as social networks, geometries, etc. A GNN primarily uses the connections and the states of neighboring nodes to update (learn) the state of each node (message passing). Predictions are then made based on the final states. This final state is generally referred to as the node embedding (calling it an encoding also seems reasonable, since the raw features of the nodes are transformed into other representations). There are various methods for message passing, but since this task deals with geometry, let's focus on models based on the Spatial Convolutional Network. This method is known to be suitable for representing data whose geometric and spatial structures matter. GNNs using a Spatial Convolutional Network enable each node to integrate information from neighboring nodes, allowing for a more accurate understanding of local characteristics. Through this, the model can better comprehend the complex shapes and features of geometry.
Convolution operations
From the left, 2D Convolution · Graph Convolution
The idea of a Spatial Convolutional Network (SCN) is similar to that of a Convolutional Neural Network (CNN). An image can be transformed into a grid-shaped graph. A CNN processes images by using filters to combine the surrounding pixels of a central pixel.
SCN extends this idea to graph structures by combining the features of neighboring connected nodes instead of neighboring pixels. Specifically, CNNs are useful for processing images in a regular grid structure, where the filter considers the surrounding area of each pixel to extract features. In contrast, SCNs operate on general graph structures, combining the features of each node with those of its connected neighbors to generate embeddings. Data preparation Since I have briefly looked into graphs and GNNs in the above, let's now prepare the data! There are already many raw polygons around us, and that is the land. First, let's collect raw polygons from vworld. Below is a part of all the raw polygons I collected from vworld. Somewhere in Seoul. shp Now that we have collected the raw polygons, let's define the characteristics of the polygons to be included in the feature matrix. They are as follows: X coordinate (normalized) Y coordinate (normalized) Inner angle of each vertex (normalized) Incoming edge length (normalized) Outgoing edge length (normalized) Concavity or convexity of each vertex Minimum rotated rectangle ratio Minimum rotated rectangle aspect ratio Polygon area Then, I need to convert these data into graph form. Here is an example of hand labeling: Land polygon in Gangbukgu, Seoul From the left, raw polygon & labeled link · adjacency matrix · feature matrix However since it is impossible to label countless raw polygons by hand, I need to generate this data automatically. It would be nice if it could be fully automated with an algorithm, but if that were possible, there wouldn't be a need for deep learning 🤔. So, I created a naive algorithm that can reduce manual work even a little. This algorithm is inspired by the triangulations of the polygon. The process of this algorithm is as follows: Triangulate a polygon: Retrieve the triangulation edges. Generate Splitter Combinations: Generate all possible combinations of splitters for iterating over them. Iterate Over Combinations: Iterate over all possible combinations of splitters to find the best segmented polygon. Check Split Validity: Check if all segmentations are valid. Compute Scores and evaluation: Compute scores to store the best segmentations. The scores (even_area_score, ombr_ratio_score, slope_similarity_score) used in the fifth step are computed as follows, and each score is aggregated as a weighted sum to obtain the segmentations with the lowest score. \[ even\_area\_score = \frac{(A_1 - \sum_{i=2}^{n} A_i)}{A_{polygon}} \times {w_1} \] \[ ombr\_ratio\_score = |(1 - \frac{A_{split1}}{A_{ombr1}}) - \sum_{i=2}^{n} (1 - \frac{A_{spliti}}{A_{ombri}})| \times {w_2} \] \[ avg\_slope\_similarity_i = \frac{\sum_{j=1}^{k_i} |\text{slope}_j - \text{slope}_{\text{main}}|}{k_i} \] \[ slope\_similarity\_score = \frac{\sum_{i=1}^{n} avg\_slope\_similarity_i}{n} \times {w_3} \] \[ score = even\_area\_score + ombr\_ratio\_score + slope\_similarity\_score \] The whole code for this algorithm can be found here, and the key part of the algorithm is as follows. 
def segment_polygon( self, polygon: Polygon, number_to_split: int, segment_threshold_length: float, even_area_weight: float, ombr_ratio_weight: float, slope_similarity_weight: float, return_splitters_only: bool = True, ): """Segment a given polygon Args: polygon (Polygon): polygon to segment number_to_split (int): number of splits to segment segment_threshold_length (float): segment threshold length even_area_weight (float): even area weight ombr_ratio_weight (float): ombr ratio weight slope_similarity_weight (float): slope similarity weight return_splitters_only (bool, optional): return splitters only. Defaults to True. Returns: Tuple[List[Polygon], List[LineString], List[LineString]]: splits, triangulations edges, splitters """ _, triangulations_edges = self. triangulate_polygon( polygon=polygon, segment_threshold_length=segment_threshold_length, ) splitters_selceted = None splits_selected = None splits_score = None for splitters in list(itertools. combinations(triangulations_edges, number_to_split - 1)): exterior_with_splitters = ops. unary_union(list(splitters) + self. explode_polygon(polygon)) exterior_with_splitters = shapely. set_precision( exterior_with_splitters, self. TOLERANCE_LARGE, mode="valid_output" ) exterior_with_splitters = ops. unary_union(exterior_with_splitters) splits = list(ops. polygonize(exterior_with_splitters)) if len(splits) != number_to_split: continue if any(split. area score_sum: splits_score = score_sum splits_selected = splits splitters_selceted = splitters if return_splitters_only: return splitters_selceted return splits_selected, triangulations_edges, splitters_selceted However, this algorithm is not perfect, and there are some problems. Because this algorithm uses the weights for computing scores, it may be sensitive to them. Look at the below figures. The first is good case, and the second is bad case. Results of the algorithm From the left, triangulations · segmentations · oriented bounding boxes for segmentations Since these problems cannot be handled by my naive algorithm, I first used the algorithm to process the raw polygon data and then labeled it manually like the following. So now I have approximately 40000 data with augmented originals. The all dataset can be found here. Some data for training Model for link prediction Now, let's create a graph model using Pytorch Geometric and teach the graph data. Pytorch Geometric is a library based on PyTorch to easily write and train graph neural networks. The role I need to assign to the model is to generate lines that will segment polygons. This can be translated into a task primarily used in GNNs, known as link prediction. Link prediction models usually use an encoder-decoder structure. The encoder creates node embeddings, which are vector representations of the nodes that extract their features. The decoder then uses these embeddings to predict the probability that a pair of nodes is connected. When inference, the model inputs all possible node pairs into the decoder. It then calculates the probability of each pair being connected. Only pairs with probabilities above a certain threshold are kept, indicating likely connections. Generally, a simple operation like the dot product is used to predict links based on the similarity of node pairs. However, I thought this approach was not suitable for tasks using geometric data, so I additionally trained a decoder. Below are the encode and decode methods of the model. The complete model code can be found here. class PolygonSegmenter(nn. 
Module): def __init__( self, conv_type: str, in_channels: int, hidden_channels: int, out_channels: int, encoder_activation: nn. Module, decoder_activation: nn. Module, predictor_activation: nn. Module, use_skip_connection: bool = Configuration. USE_SKIP_CONNECTION, ): ( . . . ) def encode(self, data: Batch) -> torch. Tensor: """Encode the features of polygon graphs Args: data (Batch): graph batch Returns: torch. Tensor: Encoded features """ encoded = self. encoder(data. x, data. edge_index, edge_weight=data. edge_weight) return encoded def decode(self, data: Batch, encoded: torch. Tensor, edge_label_index: torch. Tensor) -> torch. Tensor: """Decode the encoded features of the nodes to predict whether the edges are connected Args: data (Batch): graph batch encoded (torch. Tensor): Encoded features edge_label_index (torch. Tensor): indices labels Returns: torch. Tensor: whether the edges are connected """ # Merge raw features and encoded features to inject geometric informations if self. use_skip_connection: encoded = torch. cat([data. x, encoded], dim=1) decoded = self. decoder(torch. cat([encoded[edge_label_index[0]], encoded[edge_label_index[1]]], dim=1)). squeeze() return decoded The encoding process transforms the original feature matrix \( F \) into a new matrix \( E \) with potentially different dimensions. The following is an expression of the feature matrix before and after the encode function. \( n \) as the number of nodes \( m \) as the number of features \( c \) as the number of channels \( F \in \mathbb{R}^{n \times m} \) as the feature matrix before the encode function \( E \in \mathbb{R}^{n \times c} \) as the feature matrix after the encode function \[ F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1m} \\ f_{21} & f_{22} & \cdots & f_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nm} \end{bmatrix} \] \[ E = \begin{bmatrix} e_{11} & e_{12} & \cdots & e_{1c} \\ e_{21} & e_{22} & \cdots & e_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ e_{n1} & e_{n2} & \cdots & e_{nc} \end{bmatrix} \] In the decode method, the encoded features of the nodes and raw features are used to predict the connections (links) between them. This skip connection merges raw features and encoded features to inject geometric information. Using skip connections helps preserve original features, and enhances overall model performance by combining low-level and high-level information. In the feedforward process, the decoder is trained with connected labels and labels that should not be connected. This is a technique called negative sampling, which is used to improve model performance in link prediction tasks. By providing examples of what should not be linked, negative sampling helps the model better distinguish between actual links and non-links, leading to improved accuracy in predicting future or missing links. In most networks, actual links are significantly fewer than non-links, which can bias the model towards predicting no link. Negative sampling allows for controlled selection of negative examples, balancing the training data and enhancing the learning process. def forward(self, data: Batch) -> Tuple[torch. Tensor, torch. Tensor, torch. Tensor]: """Forward method of the models, segmenter and predictor Args: data (Batch): graph batch Returns: Tuple[torch. Tensor, torch. Tensor, torch. Tensor]: whether the edges are connected, predicted k and target k """ # Encode the features of polygon graphs encoded = self. encode(data) ( . . . 
) # Sample negative edges negative_edge_index = negative_sampling( edge_index=data. edge_label_index_only, num_nodes=data. num_nodes, num_neg_samples=int(data. edge_label_index_only. shape[1] * Configuration. NEGATIVE_SAMPLE_MULTIPLIER), method="sparse", ) # Decode the encoded features of the nodes to predict whether the edges are connected decoded = self. decode(data, encoded, torch. hstack([data. edge_label_index_only, negative_edge_index]). int()) return decoded, k_predictions, k_targets Up to this point, I defined the encoder and decoder. Since the encoder and decoder use nodes in batches, it seems that they cannot recognize each graph separately. Because I wanted to train the model on how many segmentations to divide the graph into when a graph is input, I defined a predictor to train k separately within the segmenter class. The segmenter generates the segmentations using topk through the encoder and decoder processes described above. It then sorts the generated links in order of connection strength, and the predictor decides how many links to use. Inference process From the top, topk segmentations · segmentation selected by predictor Training and evaluating It's time to train the model. The model has been trained for 500 epochs, during which both the training loss and validation loss were recorded to monitor the training progress and convergence. As shown in the plots: According to the top plots, both the training loss and validation loss decrease quickly initially and then generalize. All evaluation metrics (accuracy, F1 score, recall, and AUROC) show rapid improvement initially and then stabilize, having good generalization. Losses and metrics for 500 epochs All the metrics look good, but there is a question about whether these metrics can be trusted 100%. This may be due to the impact of negative sampling. Based on the visualization of some of the test data, it is evident that the model accurately predicts polygons that do not require segmenting. This suggests that the model may be overfitting on negative samples. High performance on negative samples might not accurately reflect the model's ability to identify positive cases correctly. Evaluate some test data qualitatively Taking inspiration from IoU loss, therefore I defined GeometricLoss to evaluate segmentation quality and create a reliable evaluation metric. The GeometricLoss aims to quantify the discrepancy between predicted and ground-truth geometric structures within a graph. The process of the geometric loss calculation is as follows: Polygon Creation from Node Features: Each graph's node features, representing coordinates, are converted into a polygon. Connecting Predicted Edges: These edges represent the predicted segmentation of the polygon. Connecting Ground-Truth Edges: These edges represent the correct segmentation of the polygon. Creating Buffered Unions: The predicted and ground-truth edges are combined and then buffered. Calculating Intersection: This intersection represents the common area between the predicted and true geometric structures. Computing Geometric Loss: The intersection area is normalized by the ground-truth area and then negated An example for the geometric loss calculation From the left, loss: -0. 000478528 · loss: -0. 999999994 The geometric loss serves solely as an evaluation metric for the model. Therefore, it has not been added to the BCE loss. This is because the model's training batches are based on nodes rather than graphs. 
Hence, calculating this loss for every graph per epoch would significantly slow down the process. Therefore, I have calculated this loss only for 4 samples per batch to use it exclusively as a model evaluation metric. The results are as follows: Geometric losses for 500 epochs From the left, train geometric loss · validation geometric loss The GeometricLoss class is defined as follows and the code can be found here. Limitations and future works GNNs are node embedding-based models, so they seem to recognize the characteristics of individual nodes rather than the overall shape of the graph. While GNNs have a good ability to generalize shapes during training, it has been challenging to overfit them accurately to the intended labels. Comparison with CNNs: CNNs excel at capturing local patterns and combining them into more abstract representations, which can be advantageous for tasks involving polygons. Imbalanced labels for predictor: The predictor is trained to select one of 0, 1, or 2, but the number of data for 2 is very few. Exploring reliable metrics: Explore metrics that are more suitable for geometric tasks. Overfitting: Studying how positive labels can be completely overfitted. References https://pytorch-geometric. readthedocs. io/en/latest/ https://en. wikipedia. org/wiki/Graph_theory/ https://mathworld. wolfram. com/AdjacencyMatrix. html/ https://process-mining. tistory. com/164 https://blog. naver. com/gyrbsdl18/222556439520 https://medium. com/watcha/gnn-%EC%86%8C%EA%B0%9C-%EA%B8%B0%EC%B4%88%EB%B6%80%ED%84%B0-%EB%85%BC%EB%AC%B8%EA%B9%8C%EC%A7%80-96567b783479/ . .


Spacewalk × Boundless

Spacewalk × Boundless

Boundless Boundless, a subsidiary of Spacewalk, true to its name, pursues innovation by breaking down boundaries through collaboration among experts from various fields.

Landbook, Korea’s first AI architectural design service, was born in Boundless’s in-house technology research institute, presenting new possibilities that encompass both technology and design. Additionally, Boundless participated as a major early investor in Green Lamp Library and Impact Wave, and recently, they are establishing a process that integrates design and construction, going beyond the traditional scope of construction management that emphasizes budget and timeline. Projects designed by Boundless Finding Wave Point’s Balance with Parametric Design ‘Wave Point’ features a design characterized by curved lines and layered forms inspired by Jeju Island’s coastline, reflecting the beauty of the marine environment. By adding the concept of ‘Point’ to the impressive exterior evoked by the name ‘Wave’, it emphasizes that this place will function as a community center beyond just being a transportation hub. ‘Wave Point: Socar Station Jeju’ is a modern and dynamic reinterpretation of Jeju’s nature, and will establish itself as a starting point for relaxed and worry-free travel. Objectives and Requirements The Wave Point: Socar Station Jeju project is currently undergoing area adjustments to meet construction costs within budget, and during this process, both architects and clients are seeking to find the optimal design balance. Design adjustments are made through various parameters, and currently, architects are spending considerable time creating a few alternatives and finding the optimal solution among them. By providing parameter-based design tools to Boundless architects, the Spacewalk technical team enables architects to make more rational decisions and better convince clients. This project aims to innovate the design process by leveraging the advantages of parametric modeling. Various design variations can be attempted by setting variables such as the radius of circular floor plans for each level, center points, floor heights, and facade curvature, which are core elements of the building. This approach offers several important advantages compared to traditional manual design methods. First, the design exploration process becomes significantly more efficient. Architects can generate and review various design options in real-time, leading to faster creative solutions. Second, more accurate decision-making is possible as changes in building area, structure, and appearance can be instantly verified during design modifications. Third, more effective communication is enabled during client consultations as various alternatives can be presented and compared in real-time. In particular, this project will set the second floor slab size, column positions, and core location and volume as fixed elements to ensure structural stability and functionality, while seeking optimal design solutions through flexible adjustment of other elements. This parametric approach is expected to greatly help in finding the balance between artistry, functionality, and economic efficiency. Parameters and Constraint The parametric design of the Wave Point project systematically controls key variables that determine the form and function of the building. These variables were established considering the building’s aesthetic value, practicality, and economic efficiency. Each parameter is interrelated, and through their combination, optimal design solutions can be derived. The parametric design system is largely divided into fixed elements and variable elements. 
Fixed elements ensure structural stability and basic functionality, while variable elements provide design flexibility and optimization possibilities. Through systematic adjustment of these elements, optimal designs can be found within budget constraints. Parameters Radius of circles per floor: A core element that determines the floor plan size of each level, directly affecting usable area and exterior form. Center point: Determines the position of each floor plan and creates overall mass movement and dynamism. Floor height per level: Determines the practical utilization of space and the building’s proportional appearance from the exterior. Tapered surface curvature (aspect ratio): Forms the building envelope’s slope and curved surfaces, completing the overall sculptural beauty. Parameters scanning Constraint Structural constraint: Second floor slab (38m diameter), column positions, core location and volume Structural constraint 3D Parameteric Model View Wave Point’s parametric model is composed of an integrated system where design elements are organically connected. This model includes all previously explained parameters and constraints, allowing real-time observation of how changes in each element affect the overall design. Through this integrated approach, we can instantly analyze the impacts of design modifications and derive optimal solutions. Socar Station Jeju Parametric Model Variation designed by Parametric System Through the parametric design system, various design variations can be explored and evaluated. Each variation is generated through combinations of previously defined parameters and can be evaluated from the following perspectives: Aesthetic value of envelope design Structural stability Construction feasibility Economic efficiency Through this multi-faceted evaluation, we can derive optimal design solutions that satisfy both project goals and constraints. The parametric system can quickly generate and analyze hundreds of design variations, helping architects make better decisions. . .
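As a rough illustration of how such a parameter scan can be wired up, the sketch below samples the parameters listed above and filters the resulting massings against a simple area budget. The parameter ranges, the area limit, and the simplified geometry are illustrative assumptions, not the actual Wave Point model; only the 38 m second-floor slab is taken from the stated constraints.

import math
import random

FIXED_SLAB_RADIUS = 19.0  # second-floor slab: 38 m diameter (fixed constraint)

def make_variation(radii, centers, floor_heights, taper):
    # Build a simple description of one massing option from the parameters.
    floors, z = [], 0.0
    for level, (r, (cx, cy), h) in enumerate(zip(radii, centers, floor_heights)):
        r_eff = r * (1.0 - taper * level)  # tapered envelope curvature
        floors.append({"level": level, "radius": r_eff, "center": (cx, cy), "z": z})
        z += h
    gross_area = sum(math.pi * f["radius"] ** 2 for f in floors)
    return {"floors": floors, "gross_area": gross_area, "height": z}

def scan(n_levels=5, n_samples=200, max_area=4500.0, seed=0):
    random.seed(seed)
    options = []
    for _ in range(n_samples):
        radii = [random.uniform(10.0, 18.0) for _ in range(n_levels)]
        radii[1] = FIXED_SLAB_RADIUS  # keep the fixed second-floor slab
        centers = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(n_levels)]
        heights = [random.uniform(3.5, 5.0) for _ in range(n_levels)]
        taper = random.uniform(0.0, 0.1)
        options.append(make_variation(radii, centers, heights, taper))
    # Keep only variations that stay within the (hypothetical) area budget.
    return [o for o in options if o["gross_area"] <= max_area]

print(len(scan()))

In practice, each surviving option would still be checked against the structural, constructability, and cost criteria listed above before being presented to the client.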


Infinite synthesis

Infinite synthesis

Introduction Deep generative AI, a field of artificial intelligence that focuses on generating new data similar to its training data, is having an impact not only on text and image generation but also on 3D model generation in the design industry. In the realm of architectural design, especially in the initial design phase, generative AI can serve as a useful tool for examining many design options.

  Physical model examination by humans Shou Sugi Ban / BYTR Architects Leveraging the Deep Signed Distance Functions model (DeepSDF) with latent vectors, this project aims to build the algorithm that can synthesize infinite number of skyscrapers simliar to trained data. These vectors, mapped within a high-dimensional latent space, serve as the DNA for synthesizing potential skyscrapers. So, through the system that manipulates the latent vectors for expressing the shape of the buildings, architectural designers can rapidly generate and examine a diverse array of design options. By manipulating(interpolation, and arithmetic operations) between two or more latent vectors, the model gives us infinite design options virtually. This method not only provides a brand-new design process but also will lead us to explore the novel architectural forms previously unattainable through conventional design methodologies. Understanding Signed Distance Functions In wikipedia, Signed Distance Functions (SDFs) is defined as follows: In mathematics and its applications, the signed distance function (or oriented distance function) is the orthogonal distance of a given point x to the boundary of a set Ω in a metric space, with the sign determined by whether or not x is in the interior of Ω. The function has positive values at points x inside Ω, it decreases in value as x approaches the boundary of Ω where the signed distance function is zero, and it takes negative values outside of Ω. However, the alternative convention is also sometimes taken instead (i. e. , negative inside Ω and positive outside). SDF representation applied to the Stanford Bunny (a) Way to decide sign: If the point is on the surface, SDF = 0 (b) 2D cross-section of the signed distance field (c) Rendered 3D surface recovered from SDF = 0 At the core of SDFs lies their simplicity and power in describing complex geometries. Unlike traditional mesh representations, which rely on vertices, edges, and faces to define forms, by using SDFs, we can construct 3d mesh models with a continuous surface, and it just needs 3D grid-shaped XYZs and their corresponding SDF values. Let's take a look at the following example for the CCTV headquarters by OMA recovered from SDF = 0. Initially, to obtain SDF values, the entire space around the CCTV headquarters model is sampled on a regular grid (In this example, resolution indicates the number of grid points. i. e. , resolution 8 means 8x8x8 grid). At each grid point, the SDF provides a value that indicates the point's distance from the closest surface of the model. Inside the model, these values are negative (or positive, depending on the convention), and outside, they are positive (or negative). As you can see in the below figure, more grid points result in more detailed and accurate 3d models. The numbers of the grid points used in examples are respectively 8x8x8(=512), 16x16x16(=4096), 32x32x32(=32768), 64x64x64(=262144), 128x128x128(=2097152). To recover meshes using grid points and SDF values, it needs to use the Marching Cubes algorithm. Recovered CCTV headquarters from the SDFs top 3, original 3d model · resolution 8 · resolution 16 bottom 3, resolution 32 · resolution 64 · resolution 128 Because we need each sign that decides whether the point is inside or outside the model, the meshes we recover from SDF values must be watertight meshes that are fully closed. By using the below code, I examined the recovered meshes from the SDF values and each grid resolution. 
The check_watertight parameter is set to True, so the code checks if the mesh is watertight, if not fully closed, it will convert the mesh to the watertight mesh using pcu. mesh = DataCreatorHelper. load_mesh( path=r"deepSDF\data\raw-skyscrapers\cctv_headquarter. obj", normalize=True, map_z_to_y=True, check_watertight=True, translate_mode=DataCreatorHelper. CENTER_WITHOUT_Z ) for resolution in [8, 16, 32, 64, 128]: coords, grid_size_axis = ReconstructorHelper. get_volume_coords(resolution=resolution) sdf, *_ = pcu. signed_distance_to_mesh( np. array(coords. cpu(), dtype=np. float32), mesh. vertices. astype(np. float32, order='F'), mesh. faces. astype(np. int32) ) recovered_mesh = ReconstructorHelper. extract_mesh( grid_size_axis, torch. tensor(sdf), ) Data preparation and processing The first step for preparing data to train DeepSDF model is to gather the 3d models of the skyscrapers. I used the 3dwarehouse to download the free 3d models. I downloaded the following models in the below figure. From the left, CCTV headquarters · Mahanakhon · Hearst Tower · Bank of China · Empire State Building · Transamerica Pyramid · The Shard · Gherkin London · Taipei 101 · Shanghai World Financial Center · One World Trade Center · Lotte Tower · Kingdom Centre · China Zun · Burj Al Arab. The 15 raw data I gathered is in this link. Skyscrapers The next step includes (1) normalizing all data to fit within a regular grid volume, (2) converting them into a consistent format. (1) normalizing: Generally, when geometry data is used for learning, it is normalized to a value between 0 and 1 for each individual object, and normalized by moving the centroid of the model to the origin(0, 0). i. e. , the farthest point of the model is set to 1. If we use the normalization method used generally, it doesn't reflect the relative height of the skyscrapers. Therefore, in this project, the height of the highest model among all skyscrapers data is set to 1 and normalized. Normalized skyscrapers From the left, the lowest building (Gherkin London) · the highest building (One World Trade Center) (2) converting: The DeepSDF model's feed-forward networks have the following architecture. It is composed of 8 fully connected layers, denoted as "FC" on the diagram. As you can see in the below figure, the dimension of the input X excepting latent vectors consists of (x, y, z) 3. The feed-forward network for DeepSDF model The data sample \( X \) is composed of \( (x, y, z) \) and the corresponding label \( s \) like this: \( X := \{(x, s) : SDF(x) = s\} \) Additionally, class numbers are required to assign a latent vector to each sample. As mentioned in the introduction part, the latent vectors play the role of the DNA in representing the shape of the buildings. class SDFdataset(Dataset, Configuration): def __init__(self, data_path: str = Configuration. SAVE_DATA_PATH): self. sdf_dataset, self. cls_nums, self. cls_dict = self. _get_sdf_dataset(data_path=data_path) def __len__(self) -> int: return len(self. sdf_dataset) def __getitem__(self, index: int) -> Tuple[torch. Tensor]: xyz = self. sdf_dataset[index, :3] sdf = self. sdf_dataset[index, 3] cls = self. sdf_dataset[index, 4]. long() return xyz. to(self. DEVICE), sdf. to(self. DEVICE), cls. to(self. DEVICE) Implementing and training of DeepSDF model As can be seen in the above part of (2) converting, the feed-forward network of the DeepSDF model is simple as follows. class SDFdecoder(nn. Module, Configuration): def __init__(self, cls_nums: int, latent_size: int = Configuration. 
LATENT_SIZE): super(). __init__() self. main_1 = nn. Sequential( nn. Linear(latent_size + 3, 512), nn. ReLU(True), nn. Linear(512, 512), nn. ReLU(True), nn. Linear(512, 512), nn. ReLU(True), nn. Linear(512, 512), nn. ReLU(True), nn. Linear(512, 512), ) self. main_2 = nn. Sequential( nn. Linear(latent_size + 3 + 512, 512), nn. ReLU(True), nn. Linear(512, 512), nn. ReLU(True), nn. Linear(512, 512), nn. ReLU(True), nn. Linear(512, 1), nn. Tanh(), ) self. latent_codes = nn. Parameter(torch. FloatTensor(cls_nums, latent_size)) self. latent_codes. to(self. DEVICE) self. to(self. DEVICE) def forward(self, i, xyz, cxyz_1=None): if cxyz_1 is None: cxyz_1 = torch. cat((self. latent_codes[i], xyz), dim=1) x1 = self. main_1(cxyz_1) # skip connection cxyz_2 = torch. cat((x1, cxyz_1), dim=1) x2 = self. main_2(cxyz_2) return x2 The SDFdecoder class has the following arguments as inputs: cls_nums is the number of skyscrapers latent_size is the dimension of the latent vector In this project, the cls_nums and latent_size were used as 15 and 128, respectively. Therefore, the size of the instance variable initialized for the latent vector (self. latent_codes) is torch. Size([15, 128]). The skip connection technique used in forward propagation enables the model to learn complex functions representing the SDF by combining low-level information (XYZ coordinates) with the high-level features learned by the network. Now let's look at the learning process. It was trained for 150 epochs, and the total number of data (number of points) is 64x64x64x15 (=3932160). This is divided into an 8:2 ratio and used in the learning and evaluation process. It took an average of 1000 seconds per epoch. At the end of each epoch loop, I added code to reconstruct the skyscraper to qualitatively evaluate the model. Training process for 150 epochs From the top, reconstructed skyscraper · losses After training the model for 150 epochs, I reconstructed the 3D model by predicting the SDF value for each point in regular grid points for 15 skyscrapers with latent vectors. In the below figure, the buildings in the left row are reconstructed by the model, and the buildings in the right row are the original 3D models. Comparing reconstructed skyscrapers vs. originals I think the model is unable to reconstruct the original skyscrapers with precise details accurately, but it seems to generate skyscrapers that are similar to the original skyscrapers appropriately. In this task for reconstructing models, I used 384x384x384(=56623104) for the grid points resolution. Synthesizing skyscrapers infinitely Lastly, let me synthesize skyscrapers by interpolating them or using arithmetic operations. Through the following code, you can generate infinite data of different shapes, starting by synthesizing latent vectors for the initial 15 buildings. In this time, to synthesize them, I used 128x128x128(=2097152) grid points resolution. def infinite_synthesis( sdf_decoder: SDFdecoder, save_dir: str, synthesis_count: int = np. inf, resolution: int = 128, map_z_to_y: bool = True, check_watertight: bool = True, ): synthesizer = Synthesizer() synthesized_latent_codes_npz = "infinite_synthesized_latent_codes. npz" synthesized_latent_codes_path = os. path. join(save_dir, synthesized_latent_codes_npz) os. makedirs(save_dir, exist_ok=True) synthesized_latent_codes = { "data": [ { "name": i, "index": i, "synthesis_type": "initial", "latent_code": list(latent_code. detach(). cpu(). numpy()), } for i, latent_code in enumerate(sdf_decoder. latent_codes) ] } if os. path. 
exists(synthesized_latent_codes_path): synthesized_latent_codes = { "data": list(np. load(synthesized_latent_codes_path, allow_pickle=True)["synthesized_data"]) } while len(synthesized_latent_codes["data"]) From 15 skyscrapers to the 450 skyscrapers The first row that is illustrated as rectangles the initial 15 skyscrapers Tracking synthesized data Since the function used above for synthesizing skyscrapers records data, we can use the data to check the parents of the synthesized skyscrapers. The process involves tracking back from any given synthesized design to its origins using graph-based analysis. This helps us to understand how specific designs are derived and the influence of original models on synthesized outcomes. Therefore, to track the synthesized skyscrapers, I used the BFS. The figures below demonstrate the application of these functions, showing the trace and visualization of synthesized skyscrapers starting from initial designs through various synthesis steps, culminating in complex structures. This illustrates the complex relationships and dependencies within the synthesized skyscrapers. Tracking synthesized skyscrapers Limitations and future works While the project demonstrates the potential of Deep Generative AI in synthesizing skyscraper designs, it's not without its limitations. So, the following points need to be improved: Design Evaluation: Although the model can synthesize skyscraper designs, it currently lacks the capability to automatically evaluate the quality of these designs. Detail Expression: One of the significant limitations of the current model is its inability to capture the intricate details of the skyscraper models accurately. Computational Resources: The process of training the model, especially at high resolutions for detailed synthesis, requires substantial computational power and time Interactive Design Tools: Developing interactive tools that allow architects to manipulate latent vectors directly or specify constraints and preferences could make the technology more practical and appealing for real-world design applications. References https://en. wikipedia. org/wiki/Signed_distance_function https://github. com/fwilliams/point-cloud-utils https://scikit-image. org/docs/stable/auto_examples/edges/plot_marching_cubes. html https://xoft. tistory. com/47 https://velog. io/@qtly_u/Skip-Connection https://arxiv. org/pdf/1901. 05103. pdf https://github. com/facebookresearch/DeepSDF https://github. com/maurock/DeepSDF . .


Latent masses

Latent masses

Objectives Generative Adversarial Networks (GANs) have paved the way for unprecedented advancements in numerous areas, from art creation to deepfake video generation. However, the potential of GANs isn't restricted to 2D space. The development and application of 3D GANs have opened new possibilities, especially in the realm of design.

This project delves deep into the possibilities of 3D GANs in the design field with the following objectives: Grasp the fundamental concepts behind GANs and their 3D extension Appreciate the power and nuances of 3D GANs through hands-on experiments Examine how 3D GANs can be harnessed for product design, architectural modeling, and virtual environment creation visualize and manipulate the latent space to generate novel and innovative designs Understand the limitations of current 3D GAN models and the potential areas of improvement Interpolation in latent space By the way, what is GANs(Generative Adversarial Networks)? 🧬 Generative Adversarial Networks, commonly referred to as GANs, are a class of artificial intelligence algorithms designed to generate new data that resemble a given set of data. The architecture of a GAN consists of two primary components: 1. Generator The role of the generator is to create fake data It takes in random noise from a latent space and produces data samples as its output The primary objective of the generator is to produce data that is indistinguishable from real data 2. Discriminator The discriminator functions as a binary classifier It aims to differentiate between real and fake data The discriminator receives both real data samples and the fake data generated by the generator, and its task is to correctly label them as 'real' or 'fake' The provided diagram illustrates this process, showing how the generator's output is evaluated by the discriminator, resulting in a loss that helps both parts improve. Generative adversarial networks concept diagram 3D shape representations for the generative adversarial networks 1. Point cloud A point cloud is a set of data points in space. In 3D shape representation, point clouds are typically used to represent the external surface of an object Each point in the point cloud has an (x, y, z) coordinate Can represent any 3D shape without being limited to a specific topology or grid (Good at flexibility) Points are disconnected, so additional processing is often required to extract surfaces or other features (Not good at lack of connectivity) Shape representation for point cloud 2. Voxel Voxels (short for volumetric pixels) are the 3D equivalent of 2D pixels. A voxel representation divides the 3D space into a regular grid, and each cell (or voxel) in the grid can be either occupied or empty Operations like convolution are straightforward to apply on voxel grids (Simplicity) To represent fine details, a very high-resolution grid is needed, which can be computationally prohibitive (Limited resolution) Shape representation for voxel 3. Mesh A 3D mesh consists of vertices, edges, and faces that define the shape of a 3D object in space. The most common type of mesh is a triangular mesh, where the shape is represented using triangles Can represent both simple and complex geometries (Good expressiveness) Provides information about how points are connected, which is useful for many applications (Good at continuous surface representation) Operations on meshes, like subdivision or simplification, can be computationally demanding (Complexity) Shape representation for mesh Simple implementation: A single sphere GAN First, I'll implement a practical application of training a GAN on point cloud data, aiming to generate a single sphere, represented by point cloud Before implementing the neural networks, we begin by loading our target sphere point cloud from a file. I modeled just one sphere shape using Rhino. 
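Below is a minimal sketch of this first step; trimesh and the file name sphere.obj are my own assumptions for illustration, and the point count is arbitrary, while the project's actual loader and PointSampler are linked just below.

import numpy as np
import trimesh  # assumed here; the project links its own loading code below

NUM_POINTS = 2048  # a consistent point count keeps the network input size fixed

mesh = trimesh.load("sphere.obj", force="mesh")   # the sphere modeled in Rhino
points = mesh.sample(NUM_POINTS)                  # uniform surface sampling -> (2048, 3)
point_cloud = np.asarray(points, dtype=np.float32)
print(point_cloud.shape)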
Typically, the normalization can be particularly beneficial if your training data consists of similar objects in various sizes or if the absolute size isn't critical for your task. In our dataset concerning a single sphere, the absolute size is not of significance. Therefore, let's normalize it. The sphere can be normalized easily using numpy as follows: class Normalize: def __call__(self, pointcloud): assert len(pointcloud. shape) == 2 norm_pointcloud = pointcloud - np. mean(pointcloud, axis=0) norm_pointcloud /= np. max(np. linalg. norm(norm_pointcloud, axis=1)) return norm_pointcloud If different 3D models have a different number of vertices, sampling a consistent number of points from each model ensures that the input size remains uniform. This is crucial when feeding data to neural networks that expect consistent input sizes. Please refer to the following link for the code related to PointSampler. A sphere, represented by point cloud From the left, original sphere · random sampled sphere · normalized and random sampled sphere Now, we have completed the data preprocessing and it is now ready for training model. Let us establish and train models comprising a simple generator and discriminator as follows: class Generator(nn. Module): def __init__(self, input_dim=3, output_dim=3, hidden_dim=128): super(Generator, self). __init__() self. fc1 = nn. Linear(input_dim, hidden_dim) self. fc2 = nn. Linear(hidden_dim, hidden_dim) self. fc3 = nn. Linear(hidden_dim, hidden_dim) self. fc4 = nn. Linear(hidden_dim, output_dim) def forward(self, x): x = torch. relu(self. fc1(x)) x = torch. relu(self. fc2(x)) x = torch. relu(self. fc3(x)) x = torch. tanh(self. fc4(x)) return x class Discriminator(nn. Module): def __init__(self, input_dim=3, hidden_dim=128): super(Discriminator, self). __init__() self. fc1 = nn. Linear(input_dim, hidden_dim) self. fc2 = nn. Linear(hidden_dim, hidden_dim) self. fc3 = nn. Linear(hidden_dim, hidden_dim) self. fc4 = nn. Linear(hidden_dim, 1) def forward(self, x): x = torch. relu(self. fc1(x)) x = torch. relu(self. fc2(x)) x = torch. relu(self. fc3(x)) x = torch. sigmoid(self. fc4(x)) return x The comprehensive code, which includes details on the generator, discriminator, data, training process, and more, can be found at the following link. Additionally, the training process visualized using Matplotlib can be viewed below. Upon examining the loss status graph, it becomes evident that a sphere begins its generation around the 2700-epoch mark. Subsequent to this point, the loss values cease to oscillate and exhibit a convergent graph. Training process of a single sphere GAN From the left, losses status · generated point cloud sphere Implementing MassGAN 🧱 From the above, we have gained some understanding of GANs through the implementation of fundamentals and a single sphere GAN. Now, based on this understanding, let's train the model with buildings (Masses) designed by architects and create a generator that produces fake Masses The procedure for the implementation of MassGAN follows the below processes: Preparation and preprocessing of the dataset Implementation of models and training them Evaluating generator, and exploration for the latent spaces Preparation and preprocessing of the dataset I collected building models designed by several famous architects for model training. The figure below shows the actual buildings from the modeling data I gathered. 
Voxel-shaped buildings From the left, RED7(MVRDV architects) · 79 and Park(BIG architects) · Mountain dwelling(BIG architects) The buildings aforementioned possess a common characteristic: their voxel-shaped configuration. As stated above, we learned three modalities of 3D shape representations pertinent to GANs. The primary limitation of the voxel-shaped representation lies in its challenge to articulate high-resolution. However, within the realm of architectural design, this constraint might be reconceived as an opportunity. The voxel-shaped form is prevalently utilized in the architecture field, and there is no imperative demand for high-resolution depictions of such forms. Therefore, we'll create a generative model that generates masses like to the aforementioned using voxel data with appropriate resolutions. Firstly, to train models utilizing modeling data, it is imperative to transform the data structure from the . obj format to the more suitable . binvox format. The . binvox format delineates data as a binary voxel grid structure, representing True (1) for solid regions and False (0) for vacant spaces. Let us look the illustrative example that is preprocessed to the . binvox format below. Binary voxel grid representations From the left, Given sphere · Voxelated sphere · Binary voxel grid(9th voxels grid) These were aforementioned in the part of my postings titled Voxelate As stated above in the binary voxel grid, one can observe the vacant regions are represented by 0s, while the solid regions are denoted by 1s. All detailed code of preprocessing to the . binvox format is showing in the following link and I preprocessed to possess 32 x 32 x 32 resolution for 6 models below utilizing it. Preprocessed data to the binary voxel grid utilizing binvox From the top left, 79 and Park · Lego tower, RED7 From the bottom left, Vancouver house · CCTV Headquarter, Mountain dwelling Implementation of models and training them We have now completed the procedures for data collection and preprocessing. Subsequently, we are now poised to commence the implementation of both the generator and the discriminator. Therefore I implemented DCGAN with a gradient penalty(WGAN) by referring to the GitHub repositories where several 3D generation models are implemented as follows. The comprehensive code delineating the model definitions can be accessed at the following link: massGAN/model. py class Generator(nn. Module, Config): def __init__(self, z_dim, init_out_channels: int = None): super(). __init__() out_channels_0 = self. GENERATOR_INIT_OUT_CHANNELS if init_out_channels is None else init_out_channels out_channels_1 = int(out_channels_0 / 2) out_channels_2 = int(out_channels_1 / 2) self. main = nn. Sequential( nn. ConvTranspose3d(z_dim, out_channels_0, kernel_size=4, stride=1, padding=0, bias=False), nn. BatchNorm3d(out_channels_0), nn. ReLU(True), nn. ConvTranspose3d(out_channels_0, out_channels_1, kernel_size=4, stride=2, padding=1, bias=False), nn. BatchNorm3d(out_channels_1), nn. ReLU(True), nn. ConvTranspose3d(out_channels_1, out_channels_2, kernel_size=4, stride=2, padding=1, bias=False), nn. BatchNorm3d(out_channels_2), nn. ReLU(True), nn. ConvTranspose3d(out_channels_2, 1, kernel_size=4, stride=2, padding=1, bias=False), nn. Sigmoid() ) self. to(self. DEVICE) def forward(self, x): return self. main(x) class Discriminator(nn. Module, Config): def __init__(self, init_out_channels: int = None): super(). __init__() out_channels_0 = self. 
DISCRIMINATOR_INIT_OUT_CHANNELS if init_out_channels is None else init_out_channels out_channels_1 = out_channels_0 * 2 out_channels_2 = out_channels_1 * 2 self. main = nn. Sequential( nn. Conv3d(1, out_channels_0, kernel_size=4, stride=2, padding=1, bias=False), nn. LeakyReLU(0. 2, inplace=True), nn. Conv3d(out_channels_0, out_channels_1, kernel_size=4, stride=2, padding=1, bias=False), nn. BatchNorm3d(out_channels_1), nn. LeakyReLU(0. 2, inplace=True), nn. Conv3d(out_channels_1, out_channels_2, kernel_size=4, stride=2, padding=1, bias=False), nn. BatchNorm3d(out_channels_2), nn. LeakyReLU(0. 2, inplace=True), nn. Conv3d(out_channels_2, 1, kernel_size=4, stride=1, padding=0, bias=False), nn. Sigmoid() ) self. to(self. DEVICE) def forward(self, x): return self. main(x). view(-1, 1). squeeze(1) We further defined the MassganTrainer for model supervision, including model training, evaluation, and storage. Throughout this process, I monitored any issues that occurred during the training phase. The recorded outcomes are presented below: Visualized training process at each 200 epochs from 0 to 20000 From the top, losses status · generated masses when training model Contrary to the a single sphere GAN that we previously trained, MassGAN does not exhibit a loss value converging to a singular point due to the complexity of the data. Neverthelesee, if you compare the early and final stages of learning, you can observe that the loss value oscillates within a low range. Furthermore, by observing the monitored fake masses, one can discern that they progressively approximate the shapes of real masses. Evaluating generator, and exploration for the latent spaces The parameters for model training, such as learning rate, batch size, noise dimension, and so forth, were used as follows: class ModelConfig: """Configuration related to the GAN models """ DEVICE = "cpu" if torch. cuda. is_available(): DEVICE = "cuda" SEED = 777 GENERATOR_INIT_OUT_CHANNELS = 256 DISCRIMINATOR_INIT_OUT_CHANNELS = 64 EPOCHS = 20000 LEARNING_RATE = 0. 0001 BATCH_SIZE = 6 BATCH_SIZE_TO_EVALUATE = 6 Z_DIM = 128 BETAS = (0. 5, 0. 999) LAMBDA_1 = 10 LOG_INTERVAL = 200 Now, let's load and evaluate the model trained with the corresponding ModelConfig. In GAN, it is important to evaluate the model quantitatively as the status of loss, but qualitatively evaluating the data generated by the Generator is also effective in evaluating the model. The following figures are generated masses by MassGAN model through the utilization of the evaluate function. Generated masses by MassGAN model All in all, it appears to produce decent data. Subsequently, let's select some of the masses created by the generator and observe the interpolation of latent mass shapes between them. Interpolation in latent space From the left, RED7 · interpolating · CCTV Headquarter Interpolation in latent space From the left, Lego tower · interpolating · Mountain dwelling Interpolation in latent space From the left, Vancouver house · interpolating · Lego tower References https://medium. com/hackernoon/latent-space-visualization-deep-learning-bits-2-bd09a46920df https://github. com/ChrisWu1997/SingleShapeGen https://github. com/znxlwm/pytorch-MNIST-CelebA-GAN-DCGAN https://developer-ping9. tistory. com/108 . .


Drafting with AI

Drafting with AI

Stable diffusion Stable Diffusion is a deep learning, text-to-image model released in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt.

An image generated by Stable Diffusion based on the text prompt: "a photograph of an astronaut riding a horse" If you use this AI to create images, you can get plausible rendered images easily and shortly with a few lines of text. Let's think about how this can be utilized in the field of architecture. Application in the field of architecture There are several rendering engines that are frequently used in the field of architectural design, but they require precise modeling for good rendering. At the start of a design project, we can throw a very rough model or sketch to the AI, and type in the prompts to get a rough idea of what we're envisioning. First, let's see the simple example. The example below is an example of capturing the rhino viewport in Rhino's GhPython environment and then rendering the image in the desired direction using ControlNet of the stable diffusion API. Demo for using stable diffusion in Rhino environment Prompt: Colorful basketball court, Long windows, Sunlight First, We need to write code for capturing the rhino viewport. I made it to be saved in the path where the current *. gh file is located. def capture_activated_viewport( self, save_name="draft. png", return_size=False ): """ Capture and save the currently activated viewport to the location of the current *. gh file path """ save_path = os. path. join(CURRENT_DIR, save_name) viewport = Rhino. RhinoDoc. ActiveDoc. Views. ActiveView. CaptureToBitmap() viewport. Save(save_path, Imaging. ImageFormat. Png) if return_size: return save_path, viewport. Size return save_path Next, we should to run our local API server after cloning stable-diffusion-webui repository. Please refer this AUTOMATIC1111/stable-diffusion-webui repository for settings required for running local API server. When all settings are done, now you can request the methods you want through API calls. I referred API guide at AUTOMATIC1111/stable-diffusion-webui/wiki/API Drafting Python serves a built-in module called urllib to use HTTP. You can check the code to the API request used in the GhPython at this link. import os import json import Rhino import base64 import urllib2 import scriptcontext as sc import System. Drawing. Imaging as Imaging class D2R: """Convert Draft to Rendered using `stable diffusion webui` API""" def __init__( self, prompt, width=512, height=512, local_url="http://127. 0. 0. 1:7860" ): self. prompt = prompt self. width = width self. height = height self. local_url = local_url (. . . ) def render(self, image_path, seed=-1, steps=20, draft_size=None): payload = { "prompt": self. prompt, "negative_prompt": "", "resize_mode": 0, "denoising_strength": 0. 75, "mask_blur": 36, "inpainting_fill": 0, "inpaint_full_res": "true", "inpaint_full_res_padding": 72, "inpainting_mask_invert": 0, "initial_noise_multiplier": 1, "seed": seed, "sampler_name": "Euler a", "batch_size": 1, "steps": steps, "cfg_scale": 4, "width": self. width if draft_size is None else draft_size. Width, "height": self. height if draft_size is None else draft_size. Height, "restore_faces": "false", "tiling": "false", "alwayson_scripts": { "ControlNet": { "args": [ { "enabled": "true", "input_image": self. _get_decoded_image_to_base64(image_path), "module": self. module_pidinet_scribble, "model": self. model_scribble, "processor_res": 1024, }, ] } } } request = urllib2. Request( url=self. local_url + "/sdapi/v1/txt2img", data=json. dumps(payload), headers={'Content-Type': 'application/json'} ) try: response = urllib2. urlopen(request) response_data = response. 
read() rendered_save_path = os. path. join(CURRENT_DIR, "rendered. png") converted_save_path = os. path. join(CURRENT_DIR, "converted. png") response_data_jsonify = json. loads(response_data) used_seed = json. loads(response_data_jsonify["info"])["seed"] used_params = response_data_jsonify["parameters"] for ii, image in enumerate(response_data_jsonify["images"]): if ii == len(response_data_jsonify["images"]) - 1: self. _save_base64_to_png(image, converted_save_path) else: self. _save_base64_to_png(image, rendered_save_path) return ( rendered_save_path, converted_save_path, used_seed, used_params ) except urllib2. HTTPError as e: print("HTTP Error:", e. code, e. reason) response_data = e. read() print(response_data) return None if __name__ == "__main__": CURRENT_FILE = sc. doc. Path CURRENT_DIR = "\\". join(CURRENT_FILE. split("\\")[:-1]) prompt = ( """ Interior view with sunlight, Curtain wall with city view Colorful Sofas, Cushions on the sofas Transparent glass Table, Fabric stools, Some flower pots """ ) d2r = D2R(prompt=prompt) draft, draft_size = d2r. capture_activated_viewport(return_size=True) rendered, converted, seed, params = d2r. render( draft, seed=-1, steps=50, draft_size=draft_size ) Physical model From the left, Draft view · Rendered view Prompt: Isometric view, River, Trees, 3D printed white model with illumination Interior view From the left, Draft view · Rendered view Prompt: Interior view with sunlight, Curtain wall with city view, Colorful sofas, Cushions on the sofas, Transparent glass table, Fabric stools, Some flower pots Floor plan From the left, Draft view · Rendered view Prompt: Top view, With sunlight and shadows, Some flower pots, Colorful furnitures, Conceptual image Skyscrapers in the city From the left, Draft view · Rendered view Prompt: Skyscrapers in the city, Night scene with many stars in the sky, Neon sign has shine brightly, Milky way in the sky Contour From the left, Draft view · Rendered view Prompt: Bird's eye view, Colorful master plan, Bright photograph, River Gabled houses From the left, Draft view · Rendered view Prompt: Two points perspective, White clouds, Colorful houses, Sunlight Kitchen From the left, Draft view · Rendered view Prompt: Front view, Kitchen with black color interior, Illuminations


K-Rooms clusters

K-Rooms clusters

K-Means clustering The K-Means clustering algorithm partitions given data into K clusters, iteratively minimizing the distance between each data point and the centroid of its cluster. It is a type of unsupervised learning and serves to label unlabeled input data.
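The partitioning steps and a Rhino/GhPython implementation follow below. As a compact reference, here is a minimal NumPy sketch of the same assign-and-update loop; it is illustrative only, and the function and variable names are mine rather than the project code.

import numpy as np

def kmeans_numpy(points, k, iterations=100, seed=0):
    """Minimal K-Means sketch: points is an (N, D) array; returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    for _ in range(iterations):
        # Assign each point to the nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)

        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == ci].mean(axis=0) if np.any(labels == ci) else centroids[ci]
            for ci in range(k)
        ])

        # Terminate when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return labels, centroids

# Example: cluster 200 random 2D points into 3 groups.
labels, centroids = kmeans_numpy(np.random.rand(200, 2), k=3)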

The K-Means algorithm belongs to a partitioning method among clustring method. A partitioning method is a way of splitting that divides multiple partitions when given data. For example, let's assume that n data objects are input. That's when partitioning method divides the given data into K groups less than N, at this time, each group forms a cluster. That is, dividing a piece of data into one or more data objects. K-means clustering, divided by 10 Implementation The operation flow of the K-Means clustering consists of 5 steps following: Select up a K (the count of clusters) and enter data Set initial centroids of clusters randomly Assign the data to each cluster based on the nearest centroid Recalculate the new centroids of clusters and re-execute step-4 Terminate if no longer locations of centroids aren't updated Let's implement the K-Means algorithm based on the steps above. First, we define the KMeans object. As input, it receives the number of clusters(K) to divide, points cloud, and iteration_count which is the number of centroid update iterations class KMeans(PointHelper): def __init__(self, points=None, k=3, iteration_count=20, random_seed=0): """KMeansCluster simple implementation using Rhino Geometry Args: points (Rhino. Geometry. Point3d, optional): Points to classify. Defaults to None. if points is None, make random points k (int, optional): Number to classify. Defaults to 3. iteration_count (int, optional): Clusters candidates creation count. Defaults to 20. random_seed (int, optional): Random seed number to fix. Defaults to 0. """ PointHelper. __init__(self) self. points = points self. k = k self. iteration_count = iteration_count self. threshold = 0. 1 import random # pylint: disable=import-outside-toplevel self. random = random self. random. seed(random_seed) Next, Initialize the centroids of clusters as much as the number of K by selecting the given points cloud as much as K randomly. If the initial centroid setting is done, calculate the distance between each centroid and the given points cloud, and assign the data at the cluster which is the closest distance. Now we should update all the centroids of clusters. We need to compute centroids of initial clusters(points cloud clusters) for that. Finally, compute the distance between the updated centroid and the previous centroid. If this distance does not no longer changes, terminate. Otherwise, just iterate on key things which are explained above. def kmeans(self, points, k, threshold): """Clusters by each iteration Args: points (Rhino. Geometry. Point3d): Initialized given points k (int): Initialized given k threshold (float): Initialized threshold Returns: Tuple[List[List[Rhino. Geometry. Point3d]], List[List[int]]]: Clusters by each iteration, Indices by each iteration """ centroids = self. random. sample(points, k) while True: clusters = [[] for _ in centroids] indices = [[] for _ in centroids] for pi, point in enumerate(points): point_to_centroid_distance = [ point. DistanceTo(centroid) for centroid in centroids ] nearest_centroid_index = point_to_centroid_distance. index( min(point_to_centroid_distance) ) clusters[nearest_centroid_index]. append(point) indices[nearest_centroid_index]. append(pi) shift_distance = 0. 0 for ci, current_centroid in enumerate(centroids): if len(clusters[ci]) == 0: continue updated_centroid = self. get_centroid(clusters[ci]) shift_distance = max( updated_centroid. 
DistanceTo(current_centroid), shift_distance, ) centroids[ci] = updated_centroid if shift_distance Now we can get the point clusters by setting the K, from the code above. And you can see the detailed code for the K-Means at this link. Implemented K-means clustering, divided by 10 From the left, Given Points · Result clusters K-Rooms clusters The K-Means algorithm is used to cluster as much as the number of K, given points, like described above. If you utilize it, you can divide a given architectural boundary as much as the number of K. It means that K number of rooms can be obtained. Before writing the code, set the following order. As input, it receives the building exterior wall line (closed line) and how many rooms to divide into Create an oriented bounding box and create a grid Insert the grid center points into the K-Means clustering algorithm, implemented above Grids are merged based on the indices of the clustered grid center points Search for the shortest path from the core to each room and create a corridor Now let's implement the K-Rooms cluster algorithm. Define the KRoomsClusters class as follows and take as input what you defined in step 1. And inherit the KMeans class. class KRoomsCluster(KMeans, PointHelper, LineHelper, ConstsCollection): """ To use the inherited moduels, refer the link below. https://github. com/PARKCHEOLHEE-lab/GhPythonUtils """ def __init__(self, floor, core, hall, target_area, axis=None): self. floor = floor self. core = core self. hall = hall self. sorted_hall_segments = self. get_sorted_segment(self. hall) self. target_area = target_area self. axis = axis KMeans. __init__(self) PointHelper. __init__(self) LineHelper. __init__(self) ConstsCollection. __init__(self) Next, we create an OBB(oriented bounding box) to create the grid. OBB is for defining the grid xy axes. (See this link to see the OBB creation algorithm) Then, create grid and grid centroid, extract the clustering indices by putting the center points of the grid and the K value, and then merge all the grids that exist in the same cluster. K-Rooms clustering key process, divided by 5 From the left, Grid and centroids creation · Divided rooms def _gen_grid(self): self. base_rectangles = [ self. get_2d_offset_polygon(seg, self. grid_size) for seg in self. sorted_hall_segments[:2] ] counts = [] planes = [] for ri, base_rectangle in enumerate(self. base_rectangles): x_vector = ( rg. AreaMassProperties. Compute(base_rectangle). Centroid - rg. AreaMassProperties. Compute(self. hall). Centroid ) y_vector = copy. copy(x_vector) y_transform = rg. Transform. Rotation( math. pi * 0. 5, rg. AreaMassProperties. Compute(base_rectangle). Centroid, ) y_vector. Transform(y_transform) base_rectangle. Translate( ( self. sorted_hall_segments[0]. PointAtStart - self. sorted_hall_segments[0]. PointAtEnd ) / 2 ) base_rectangle. Translate( self. get_normalized_vector(x_vector) * -self. grid_size / 2 ) anchor = rg. AreaMassProperties. Compute(base_rectangle). Centroid plane = rg. Plane( origin=anchor, xDirection=x_vector, yDirection=y_vector, ) x_proj = self. get_projected_point_on_curve( anchor, plane. XAxis, self. obb ) x_count = ( int(math. ceil(x_proj. DistanceTo(anchor) / self. grid_size)) + 1 ) y_projs = [ self. get_projected_point_on_curve( anchor, plane. YAxis, self. obb ), self. get_projected_point_on_curve( anchor, -plane. YAxis, self. obb ), ] y_count = [ int(math. ceil(y_proj. DistanceTo(anchor) / self. grid_size)) + 1 for y_proj in y_projs ] planes. append(plane) counts. 
append([x_count] + y_count) x_grid = [] for base_rectangle, count, plane in zip( self. base_rectangles, counts, planes ): xc, _, _ = count for x in range(xc): copied_rectangle = copy. copy(base_rectangle) vector = plane. XAxis * self. grid_size * x copied_rectangle. Translate(vector) x_grid. append(copied_rectangle) y_vectors = [planes[0]. YAxis, -planes[0]. YAxis] y_counts = counts[0][1:] all_grid = [] + x_grid for rectangle in x_grid: for y_count, y_vector in zip(y_counts, y_vectors): for yc in range(1, y_count): copied_rectangle = copy. copy(rectangle) vector = y_vector * self. grid_size * yc copied_rectangle. Translate(vector) all_grid. append(copied_rectangle) union_all_grid = rg. Curve. CreateBooleanUnion(all_grid, self. TOLERANCE) for y_count, y_vector in zip(y_counts, y_vectors): for yc in range(1, y_count): copied_hall = copy. copy(self. hall) copied_hall. Translate( ( self. sorted_hall_segments[0]. PointAtStart - self. sorted_hall_segments[0]. PointAtEnd ) / 2 ) vector = y_vector * self. grid_size * yc copied_hall. Translate(vector) all_grid. extend( rg. Curve. CreateBooleanDifference( copied_hall, union_all_grid ) ) self. grid = [] for grid in all_grid: for boundary in self. boundaries: tidied_grid = list( rg. Curve. CreateBooleanIntersection( boundary. boundary, grid, self. TOLERANCE ) ) self. grid. extend(tidied_grid) self. grid_centroids = [ rg. AreaMassProperties. Compute(g). Centroid for g in self. grid ] Finally, connect each room with a hallway and you're done! (I implemented Dijkstra algorithm for the shortest-path algorithm. You can check the implementation through the following link. I will do a post on this at the next opportunity. ) Corridor creation Limitation of K-Rooms clusters The K-Rooms Clusters algorithm we implemented is still incomplete. The images shown above are valid results when the K value is small. When the K value increases and the number of rooms increases, a problem in which an architecturally appropriate shape cannot be derived. For example below: Improper shapes, divided by 13 From the left, K-Rooms · Result corridor All we need to do further to solve this problem is to define the appropriate shape and create the logic to mitigate it with post-processing. And you can see the whole code of K-Rooms clusters at this link. . .


IFC format study

IFC format study

Objectives The goal is to analyze the structure and characteristics of the IFC (Industry Foundation Classes) format to evaluate its potential applications in construction information modeling. Specifically: Understand IFC format’s data structure and storage system, particularly focusing on how relationships between architectural elements are expressed.

Analyze the characteristics and usage methods of major IFC processing tools such as ifcopenshell-python and blender-bim. Compare with existing Landbook rendering work processes to examine the practical applicability and limitations of IFC-based workflows. Through this, we aim to identify the benefits of the IFC format in terms of data management and interoperability, and derive insights that can be used in future construction information modeling system development. What is IFC? IFC (Industry Foundation Classes) is a CAD data exchange schema for describing architectural, building and construction industry data. It is a platform-neutral, open data schema specification that is not controlled by a single vendor or group of vendors. It was created for information sharing between software and is being developed by buildingSMART based on international standards. https://en. wikipedia. org/wiki/Industry_Foundation_Classes https://m. blog. naver. com/PostView. naver?isHttpsRedirect=true&blogId=silvury14&logNo=10177489179 https://en. wikipedia. org/wiki/BuildingSMART How to Read IFC UI - blender console - blenderBIM (ifcopenshell - python) Loading and Manipulating IFC Files in Blender Sample files https://github. com/myoualid/ifc-101-course/tree/main/episode-01/Resources Sample file IFC Structure and Relationships IFC의 spatial structure Project aggregates Site aggregates Facility: building( bridge, road, railway) aggregates Building storey Contains Products (building elements) IFC spatial structure - Sample file IFC spatial structure diagram file = ifcopenshell. open(r"C:\Users\MAD_SCIENTIST_21. ifc") project = file. by_type("IfcProject")[0] site = project. IsDecomposedBy[0]. RelatedObjects[0] building = site. IsDecomposedBy[0]. RelatedObjects[0] building_storeys = building. IsDecomposedBy[0]. RelatedObjects for building_storey in building_storeys: print(building_storey) sous_sol = building_storeys[0] sous_sol. get_info() rel_contained_structure = sous_sol. ContainsElements[0] rel_contained_structure. RelatedElements for element in rel_contained_structure. RelatedElements: print(element) The elements output using this Blender Python script are identical to the objects shown in the right panel in Blender (shown in the IFC spatial structure - sample file image). 출력 결과 IFC Class Inheritance Structure and Properties IFC classes are broadly divided into rooted classes and non-rooted classes. rooted class Inherited from the ifcroot class https://standards. buildingsmart. org/IFC/DEV/IFC4_3/RC1/HTML/schema/ifckernel/lexical/ifcroot. htm Cardinality: whether required or not The rest are optional, used for tracking items 3 subclasses: ifcObjectDefinition ifcPropertyDefinition ifcRelationship Looking at attribute inheritance shows which attributes are inherited from which parent rooted class code Left-click an object → object properties to view and modify IFC properties ifc properties ifc construction type is a relation - clicking it shows all objects of the same type ifc construction type Attributes and Property Sets Property sets are very important - if they’re not present according to the schema template, they get added as custom properties. There are inherited psets that come from wall types (quantity sets are for numerical properties). attribute와 property set code Let’s access the above content using ifcopenshell code: rooted_entities = file. by_type("IfcRoot") ifc_building_element_entities = set() for entity in rooted_entities: if entity. is_a("IfcBuildingElement"): ifc_building_element_entities. 
add(entity. is_a()) my_wall = file. by_id("1K9fMEc5bCUxo4LlWWYA9b") my_wall. Description my_wall. OwnerHistory my_wall. Name my_wall. GlobalId my_wall. Description my_wall. Tag my_wall. PredefinedType my_wall. IsTypedBy[0]. RelatingType # similar construction type 보는 것과 유사한 기능. my_wall. IsTypedBy[0]. RelatedObjects my_wall. is_a() my_wall. is_a("IfcRoot") my_wall. is_a("IfcBuildingElement") my_wall. is_a("IfcProduct") my_wall. is_a("IfcWall") Property Sets and Quantity Sets The attributes covered above aren’t sufficient to express all information. In IFC, more information can be expressed through Property Sets (Pset) and Quantity Sets (Qset). Property Set (Pset) A collection of properties defining additional characteristics of objects Standard Psets: Standard property sets defined by buildingSMART (e. g. , Pset_WallCommon) Custom Psets: Property sets that can be defined according to user needs Main property types: Single Value: Single values like strings, numbers, booleans Enumerated Value: Selection from predefined list of values Bounded Value: Numerical values with upper and lower bounds List Value: List of multiple values Table Value: 2D data structures Among the various objects in the isDefinedBy member of IfcWall, there is an object called IfcRelDefinedsByProperties. The code loops through those where the RelatingPropertyDefinition member is an IfcPropertySet. Within the HasProperties of that pset, there are multiple IfcPropertySingleValue objects containing Name, NominalValue, etc. These are stored in a dictionary called props. → props are stored in psets. from blenderbim. bim. ifc import IfcStore file = IfcStore. file path = IfcStore. path my_wall. IsDefinedBy my_wall. IsDefinedBy[0]. RelatingPropertyDefinitio my_wall. IsDefinedBy[1]. RelatingPropertyDefinitio my_wall. IsDefinedBy[2]. RelatingPropertyDefinitio my_wall. IsDefinedBy[3]. RelatingPropertyDefinitio pset = my_wall. IsDefinedBy[3]. RelatingPropertyDefinition pset. HasProperties[0]. Name pset. HasProperties[0]. NominalValue. wrappedValue psets = {} if my_wall. IsDefinedBy: for relationship in my_wall. IsDefinedBy: if relationship. is_a("IfcRelDefinesByProperties") and relationship. RelatingPropertyDefinition. is_a("IfcPropertySet"): pset = relationship. RelatingPropertyDefinition props = {} for property in pset. HasProperties: if property. is_a("IfcPropertySingleValue"): props[property. Name] = property. NominalValue. wrappedValue psets[pset. Name] = props print(pset. Name + " was added!") psets Do we have to do this manually? → util has more sophisticated functionality. import ifcopenshell. util. element ifcopenshell. util. element. get_psets(my_wall, psets_only=True) Quantity Set (Qset) A set containing physical quantity information of objects Main quantity types: Length Area Volume Weight Count Time my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. is_a("IfcQuantitySet") my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. Quantities my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. Quantities[0]. Name my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. Quantities[0]. LengthValue my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. Quantities[3]. AreaValue my_wall. IsDefinedBy[0]. RelatingPropertyDefinition. Quantities[3][3] qsets = {} if my_wall. IsDefinedBy: for relationship in my_wall. IsDefinedBy: if relationship. is_a("IfcRelDefinesByProperties") and relationship. RelatingPropertyDefinition. is_a("IfcQuantitySet"): qset = relationship. RelatingPropertyDefinition quantities = {} for quantity in qset. 
Quantities: if quantity. is_a("IfcPhysicalSimpleQuantity"): quantities[quantity. Name] = quantity[3] qsets[qset. Name] = quantities print(qset. Name + " was added!") qsets Checking numerical units. The units for each quantity are specified. These quantities can be grouped and exported to CSV, etc. project = file. by_type("IfcProject") project = project[0] for unit in project. UnitsInContext. Units: print(unit) BIM application with python A guide for developing web app front-end prototypes using streamlit. Provides visualization of viewer, CSV downloads, and statistical figures. ifc viewer Prototyping import uuid import time import ifcopenshell import ifcopenshell. guid import json from ifc_builder import IfcBuilder O = 0. 0, 0. 0, 0. 0 X = 1. 0, 0. 0, 0. 0 Y = 0. 0, 1. 0, 0. 0 Z = 0. 0, 0. 0, 1. 0 create_guid = lambda: ifcopenshell. guid. compress(uuid. uuid1(). hex) # IFC template creation filename = "hello_wall. ifc" # https://standards. buildingsmart. org/IFC/DEV/IFC4_3/RC2/HTML/schema/ifcdatetimeresource/lexical/ifctimestamp. htm # NOTE: 초단위. timestamp = int(time. time()) timestring = time. strftime("%Y-%m-%dT%H:%M:%S", time. gmtime(timestamp)) creator = "CKC" organization = "SWK" application, application_version = "IfcOpenShell", "0. 7. 0" project_globalid, project_name = create_guid(), "Hello Wall" # A template IFC file to quickly populate entity instances for an IfcProject with its dependencies template = ( """ISO-10303-21; HEADER; FILE_DESCRIPTION(('ViewDefinition [CoordinationView]'),'2;1'); FILE_NAME('%(filename)s','%(timestring)s',('%(creator)s'),('%(organization)s'),'%(application)s','%(application)s',''); FILE_SCHEMA(('IFC4')); ENDSEC; DATA; #1=IFCPERSON($,$,'%(creator)s',$,$,$,$,$); #2=IFCORGANIZATION($,'%(organization)s',$,$,$); #3=IFCPERSONANDORGANIZATION(#1,#2,$); #4=IFCAPPLICATION(#2,'%(application_version)s','%(application)s',''); #5=IFCOWNERHISTORY(#3,#4,$,. ADDED. ,$,#3,#4,%(timestamp)s); #6=IFCDIRECTION((1. ,0. ,0. )); #7=IFCDIRECTION((0. ,0. ,1. )); #8=IFCCARTESIANPOINT((0. ,0. ,0. )); #9=IFCAXIS2PLACEMENT3D(#8,#7,#6); #10=IFCDIRECTION((0. ,1. ,0. )); #11=IFCGEOMETRICREPRESENTATIONCONTEXT($,'Model',3,1. E-05,#9,#10); #12=IFCDIMENSIONALEXPONENTS(0,0,0,0,0,0,0); #13=IFCSIUNIT(\*,. LENGTHUNIT. ,$,. METRE. ); #14=IFCSIUNIT(\*,. AREAUNIT. ,$,. SQUARE_METRE. ); #15=IFCSIUNIT(\*,. VOLUMEUNIT. ,$,. CUBIC_METRE. ); #16=IFCSIUNIT(\*,. PLANEANGLEUNIT. ,$,. RADIAN. ); #17=IFCMEASUREWITHUNIT(IFCPLANEANGLEMEASURE(0. 017453292519943295),#16); #18=IFCCONVERSIONBASEDUNIT(#12,. PLANEANGLEUNIT. ,'DEGREE',#17); #19=IFCUNITASSIGNMENT((#13,#14,#15,#18)); #20=IFCPROJECT('%(project_globalid)s',#5,'%(project_name)s',$,$,$,$,(#11),#19); ENDSEC; END-ISO-10303-21; """ % locals() ) def run(): print(type(template)) # Write the template to a temporary file # temp_handle, temp_filename = tempfile. mkstemp(suffix=". ifc", text=True) # print(temp_filename) # with open(temp_filename, "w", encoding="utf-8") as f: # f. write(template) # os. close(temp_handle) temp_filename = "temp. ifc" with open(temp_filename, "w", encoding="utf-8") as f: f. write(template) print(template) with open("result-bldg. json", "r", encoding="utf-8") as f: bldg_info = json. load(f) # Obtain references to instances defined in template ifc_file_template = ifcopenshell. open(temp_filename) # IFC hierarchy creation ifc_builder = IfcBuilder(bldg_info, ifc_file_template) # site 생성 site_placement, site = ifc_builder. add_site("Site") # project -> site ifc_builder. add_rel_aggregates("Project Container", ifc_builder. 
project, [site]) # building 생성 building_placement, building = ifc_builder. add_building("Building", relative_to=site_placement) # site -> building ifc_builder. add_rel_aggregates("Site Container", site, [building]) # storeys 생성 storey_placement_list = [] building_storey_list = [] for i in range(bldg_info["unit_info"]["general"]["floor_count"]): storey_placement, building_storey = ifc_builder. add_storey( f"Storey_{i+1}", building_placement, elevation=i * 6. 0 ) storey_placement_list. append(storey_placement) building_storey_list. append(building_storey) print(building_storey_list) # building -> storeys ifc_builder. add_rel_aggregates("Building Container", building, building_storey_list) ifc_walls = [] # 외벽과 창문 for i, (exterior_wall_polyline_each_floor, exterior_wall_area_each_floor, window_area_each_floor) in enumerate( zip( ifc_builder. ifc_preprocess. exterior_wall_polyline, ifc_builder. ifc_preprocess. exterior_wall_area, ifc_builder. ifc_preprocess. window_area, ) ): for ( exterior_wall_polyline_in_floor_polygon, exterior_wall_area_in_floor_polygon, window_area_in_floor_polygon, ) in zip( exterior_wall_polyline_each_floor, exterior_wall_area_each_floor, window_area_each_floor, ): for (exterioir_wall_polyline, exterior_wall_area, window_area_list) in zip( exterior_wall_polyline_in_floor_polygon, exterior_wall_area_in_floor_polygon, window_area_in_floor_polygon, ): print(storey_placement_list[i]) print(building_storey_list[i]) wall, windows = ifc_builder. add_exterior_wall( storey_placement_list[i], exterioir_wall_polyline, exterior_wall_area, window_area_list ) ifc_walls. append(wall) ifc_builder. add_rel_contained_in_spatial_structure( "Building Storey Container", building_storey_list[i], wall ) print(windows) for window in windows: ifc_builder. add_rel_contained_in_spatial_structure( "Building Storey Container", building_storey_list[i], window ) # 층 바닥 for i, floor_slab_polyline_each_floor in enumerate(ifc_builder. ifc_preprocess. floor_slab_polyline): ifc_floor_slabs_in_a_floor = [] for floor_slab_polyline_each_polygon in floor_slab_polyline_each_floor: floor_slab = ifc_builder. add_floor_slab(0. 2, storey_placement_list[i], floor_slab_polyline_each_polygon) # , building_storey_list[i] print(floor_slab) ifc_floor_slabs_in_a_floor. append(floor_slab) ifc_builder. add_rel_contained_in_spatial_structure( "Building Storey Container", building_storey_list[i], floor_slab, ) # 세대 내벽 ifc_interior_walls = [] for i, (interior_wall_polyline_each_floor, interior_wall_area_each_floor) in enumerate( zip( ifc_builder. ifc_preprocess. interior_wall_polyline, ifc_builder. ifc_preprocess. interior_wall_area, ) ): for ( interior_wall_polyline, interior_wall_area, ) in zip(interior_wall_polyline_each_floor, interior_wall_area_each_floor): print(storey_placement_list[i]) print(building_storey_list[i]) wall = ifc_builder. add_interior_wall( storey_placement_list[i], interior_wall_polyline, interior_wall_area, ) ifc_interior_walls. append(wall) ifc_builder. add_rel_contained_in_spatial_structure( "Building Storey Container", building_storey_list[i], wall ) # material 적용 방식중 맞는 방식 확인 필요. # ifc_builder. add_material_style(ifc_interior_walls, ifc_builder. wall_style) # ifc_builder. add_material_style(ifc_walls, ifc_builder. glass_style) for i, flight_info_each_floor in enumerate(ifc_builder. ifc_preprocess. flight_info): ifc_stair = ifc_builder. add_stair( flight_info_each_floor, storey_placement_list[i], ) ifc_builder. 
add_rel_contained_in_spatial_structure( "Building Storey Container", building_storey_list[i], ifc_stair ) print(filename) # Write the contents of the file to disk ifc_builder. build() if __name__ == "__main__": run() results Conclusion The IFC format has a much more systematic structure than expected, capable of storing a wide variety of elements. Due to this comprehensive structure, documentation and practical usage can be somewhat complex. The structural characteristic of defining containment and association relationships as entities is particularly impressive and serves as a good reference for future use. While internet resources on IFC creation using ifcopenshell-python are relatively scarce, blender-bim is used as the main tool instead, and information about other programming language bindings and tools is relatively abundant. From a rendering work perspective, compared to existing methods, there isn’t much difference in efficiency as necessary elements still need to be created directly. However, even though the same preprocessing is required, there are advantages in terms of being able to store, provide, and manage this information. References https://www. youtube. com/playlist?list=PLbFY94gzUJhGkxOUZknWupIiBnY5A0KUM https://standards. buildingsmart. org/IFC/RELEASE/IFC4/ADD1/HTML/schema/ifcproductextension/lexical/ifcbuilding. htm . .


Accelerating Reinforcement Batch Inference Speed

Accelerating Reinforcement Batch Inference Speed

Spacewalk is a proptech company that leverages artificial intelligence and data technologies to implement optimal land development scenarios. The architectural design AI, based on reinforcement learning algorithms, creates the most efficient architectural designs.

A representative product of the company is Landbook. Landbook ㅡ Architectural AI Reinforcement learning is a method where an agent, defined within a specific environment, recognizes its current state and selects an action among possible options to maximize rewards. In reinforcement learning using deep learning, the agent learns by performing various actions and receiving rewards accordingly. To produce high-quality design proposals, it is essential to conduct related research swiftly and efficiently. However, the inference process of architectural design AI has required significant time, necessitating improvements. In this post, we will share a case study on improving the inference time of architectural design AI. Inference process of architectural AI The inference process for a single data batch proceeds as follows. In one inference process, the agent and environment interact with each other. The agent generates actions based on the current state. The environment delivers a new state to the agent based on the agent’s actions. When these two steps are repeated N times, one inference process is completed. Inference process of architectural AI AS-IS: Using Environment Package The existing method uses the environment as a Python package. This means that the agent and environment operate on the same computing resources. For environment computations, each operation can be performed independently. Since we typically use computing resources with multiple cores, we process environment operations in parallel according to the number of available cores. Parallel processing when using environment package The number of cores in the computing resources used for training directly affects the training speed. For example, let’s compare cases where we have $4$ cores versus $8$ cores when conducting training with a batch size of $128$. When there are $4$ cores, $128$ environment operations are processed 4 at a time simultaneously. This means that when approximately $128 / 4 = 32$ environment operations are completed on each core, all $128$ environment operations will be finished. Environment operations with 4-core parallel processing On the other hand, with $8$ cores, $128$ environment operations are processed 8 at a time, meaning that after $128 / 8 = 16$ environment operations, all $128$ operations will be completed. Therefore, compared to the $4$-core case, we can expect the environment computation time to be approximately twice as fast. Environment operations with 8-core parallel processing However, the number of cores in any given computing resource is physically limited. In other words, cores cannot be increased beyond a certain number. Additionally, the training batch size may increase as models grow larger or research directions change. Therefore, if environment operations are dependent on the computing resources used for training, the training speed will slow down as the batch size increases. TO-BE: Operating Environment Servers To make environment computations independent of the agent’s computing resources, we changed the environment from a package form to a server form. This means that the computing resources for agent operations and environment operations can be operated independently. Parallel processing when using environment servers When environment computations are done in server form, there is no limit to the number of simultaneous environment operations that can be processed. 
Compared to the previous method, the environment server approach can process environment operations simultaneously up to the number of environment servers currently in operation. Additionally, since we use AWS resources, there are no specific limitations on the number of environment servers. Implementation Let me introduce how Spacewalk operates the environment servers. Here are the technology stacks we use: FastAPI: Framework for implementing environment servers AWS: Computing resources needed for environment servers Kubernetes: Container orchestration tool for operating environment servers Newrelic: Tool for monitoring environment servers Implementing Environment Servers Using FastAPI We used the FastAPI framework to implement the environment servers. Note that the existing environment package was implemented in Python, so for implementation convenience we built the environment server in Python as well. FastAPI has the following advantages: It is fast, with some of the best performance among Python web frameworks. It adopts development standards, so OpenAPI (Swagger UI) can be utilized. It can be developed at low development cost. Many companies also report positive experiences with FastAPI. FastAPI Reviews The server can be implemented simply with the following code. We implemented it by importing the environment package inside the environment server.

import logging
import pickle
import sys
import time

from fastapi import FastAPI, File, UploadFile
from fastapi.responses import Response
from fastapi.logger import logger as fastapi_logger

# Environment package
from swkgym import step

app = FastAPI(
    contact={
        "name": "roundtable",
        "email": "roundtable@spacewalk.tech"
    }
)

@app.on_event("startup")
async def startup_event():
    pass

@app.post("/step")
async def step_endpoint(inputs: UploadFile = File(...)):
    # Named differently from the imported `step` so the package function is not shadowed.
    start = time.time()
    raw_bytes = await inputs.read()
    inputs = pickle.loads(raw_bytes)
    fastapi_logger.log(logging.INFO, "Input is Ready")

    # Environment computation
    states = step(inputs["state"], inputs["action"], inputs["p"], inputs["is_training"], inputs["args"])

    fastapi_logger.log(logging.INFO, "Packing Input is Start")
    size_of_response_file = sys.getsizeof(states)
    fastapi_logger.log(logging.INFO, f"response states size is {size_of_response_file} bytes")
    fastapi_logger.log(logging.INFO, f"{inputs['p']} Step time: {time.time() - start}")

    return Response(content=pickle.dumps(states))

Using AWS for Dynamic Allocation As mentioned above, the environment servers use AWS resources. Since architectural AI training is not a constant event (24 hours a day, every day), purchasing on-premises equipment involves considerable uncertainty. Therefore, we needed to use computing resources dynamically, and we chose AWS, a leading cloud service provider. Note that AWS's EKS provides Cluster Autoscaler functionality. This means that as usage increases, the number of nodes (computing resources) that make up the EKS cluster will also increase. Looking at the actual internal usage patterns of environment servers, we can see the following trends. Since environment servers are utilized when training architectural design AI, we estimated usage based on GPU utilization rates. Environment server usage pattern, estimated from GPU utilization Unlike services that run constantly, requests don't occur regularly, and when they do occur, the volume is far higher than during idle periods. Therefore, we determined that dynamic resource allocation was reasonable.
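For reference, the agent side can fan a batch of step requests out to the environment servers concurrently. The sketch below is illustrative only: the URL is a placeholder, and the actual /step endpoint above accepts the pickled payload as an uploaded file (multipart form data), so a real client would encode the request accordingly; the fan-out pattern is the point here.

import pickle
import concurrent.futures
import urllib.request

# Placeholder address; in practice this points at the load balancer in front of the env pods.
ENV_SERVER_URL = "http://env-service.example.internal/step"

def request_step(state, action):
    """Send one environment step request and unpickle the returned states (simplified encoding)."""
    payload = pickle.dumps({"state": state, "action": action, "p": 0, "is_training": False, "args": None})
    request = urllib.request.Request(ENV_SERVER_URL, data=payload, method="POST")
    with urllib.request.urlopen(request) as response:
        return pickle.loads(response.read())

def batch_step(states, actions, max_workers=64):
    """Fan a batch of step requests out concurrently; the load balancer spreads them across pods."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(request_step, states, actions))

Because each request is independent, the effective parallelism is bounded by the number of environment server pods rather than by the cores of the training machine.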
Kubernetes(EKS) for Operating Environment Servers We used Kubernetes to manage environment servers in a stable manner while reducing operational costs. For those interested in learning about Kubernetes, we recommend referring to the official Kubernetes documentation. Using Kubernetes allows us to systematically handle much of the container management that was previously done manually. For example, instead of having developers manually restart containers when they go down, the system checks container status and automatically recovers them. This is called Self-Healing. Service discovery and load balancing can also be easily implemented. Environment servers can be exposed externally to allow agent models to access them, and the load on environment servers is distributed using AWS Network Load Balancer. Additionally, Kubernetes provides Autoscaling functionality. There are two types of Autoscaling: Horizontal Pod Autoscaling and Vertical Pod Autoscaling. Horizontal Pod Autoscaler adjusts the number of Pods with identical resources, while Vertical Pod Autoscaling adjusts the resources allocated to Pods. Kubernetes Horizontal Pod Autoscaler As mentioned earlier, environment server requests do not occur at regular intervals. Therefore, operating the same number of environment servers constantly would be inefficient. We needed to increase the number of environment servers when requests come in and decrease them when there are no requests, which is why we used Kubernetes’ HPA. When visualized, the configuration looks like this: Diagram when using HPA Newrelic Adapter for Defining Environment Server External Metrics Above, we mentioned using HPA to adjust the number of environment servers. So what criteria can we use to adjust the number of environment servers? At the most basic level, Kubernetes can scale based on pod resource usage: type: Resource resource: name: cpu target: type: Utilization averageUtilization: 60 CPU Utilization-based scaling However, using pod resource usage as an HPA metric for environment servers had some limitations. In practice, scaling based on CPU usage can lead to the following issues. Let’s assume the average and maximum CPU Utilization for a single request to the environment server is 90%. Setting averageUtilization to 90%: The number of Pods doesn’t increase. Even when requests pile up on a single environment server, the number of Pods doesn’t increase. The environment server processes one request at a time, so CPU Utilization remains at an average of 90%. Setting averageUtilization below 90%: The number of Pods becomes larger than actually needed. With a batch size of 128, the number of environment servers increases to over 128 to bring averageUtilization below 90%. For more precise scaling, we needed to know the number of requests per environment server pod. For example, if we set an appropriate request rate of 4 requests per second per environment server pod, we can scale from 1 to 4 environment server pods when 16 requests per second come in. However, Kubernetes itself doesn’t have the capability to measure request volume for Pods. Instead, we can register custom metrics to use as scaling metrics. Monitoring tools that can be used with Kubernetes, such as Newrelic or Prometheus, make it easy to register custom metrics. Internally, we use Newrelic to monitor our deployed services. Therefore, we decided to use Newrelic for HPA metrics. Newrelic is a powerful monitoring tool. And Newrelic provides the New Relic Metrics Adapter. 
The New Relic Metrics Adapter registers various metrics provided by Newrelic as Kubernetes Metrics. Newrelic Metrics Adapter Installation can be done as follows. Note that if Newrelic is already installed in your Kubernetes Cluster, you can install the New Relic Metrics Adapter separately. If you meet all of the following requirements, you can easily install it using Helm: Kubernetes 1. 16 or higher. The New Relic Kubernetes integration. New Relic’s user API key. No other External Metrics should be registered. helm upgrade --install newrelic newrelic/nri-bundle \ --namespace newrelic --create-namespace --reuse-values \ --set metrics-adapter. enabled=true \ --set newrelic-k8s-metrics-adapter. personalAPIKey=YOUR_NEW_RELIC_PERSONAL_API_KEY \ --set newrelic-k8s-metrics-adapter. config. accountID=YOUR_NEW_RELIC_ACCOUNT_ID \ --set newrelic-k8s-metrics-adapter. config. externalMetrics. {external_metric_name}. query={NRQL query} Script for installing newrelic metric adapter Earlier, we mentioned that we needed to scale environment servers based on request volume. For this, we defined the following External Metric: external_metric_name: env_servier_request_per_seconds NRQL query: FROM Metric SELECT average(k8s. pod. netRxBytesPerSecond) / {bytes per request} / uniqueCount(k8s. podName) SINCE 1 minute AGO WHERE k8s. deploymentName = 'env-service-deployment' After installation is complete, you can verify that the External Metric is working properly with the following command: $ kubectl get --raw "/apis/external. metrics. k8s. io/v1beta1/namespaces/*/env_servier_request_per_seconds" >>> {"kind":"ExternalMetricValueList","apiVersion":"external. metrics. k8s. io/v1beta1","metadata":{},"items":[{"metricName":"env_servier_request_per_seconds","metricLabels":{},"timestamp":"2022-02-08T02:28:17Z","value":"0"}]} Registered Metric check Once we’ve confirmed the metric is working properly, it’s time to deploy the HPA. We can configure the HPA with the following YAML: apiVersion: autoscaling/v2beta2 kind: HorizontalPodAutoscaler metadata: name: env-service-autoscaling spec: scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: env-service-deployment minReplicas: 1 maxReplicas: 64 metrics - type: External external: metric: name: env_servier_request_per_seconds selector: matchLabels: k8s. namespaceName: default target: type: Value value: "4" Autoscaler using External Metric Now that HPA has been fully deployed, the overall diagram looks like this: Newrelic Metrics Adapter Experiments Let’s examine how much inference speed can be improved when operating environment servers. 64 Batch Inference 1024 Batch Inference Agent server: c4. xlarge, $0. 227 USD per hour Environment server: c4. xlarge/4, $0. 061 USD per hour AWS cost calculation = (hourly agent server cost + hourly environment server cost * number of servers used) * usage time(seconds) / 3600 Note that HPA was not applied in the experiments for clear comparison. There is a difference in inference time between the environment package with 4 CPUs and 4 environment servers. When using CPU inference, the agent model and environment package computations share CPU resources. However, with environment servers, the CPU used by the agent model is separate from the CPU used by the environment package, eliminating processing delays from environmental computations. In the 64-batch inference experiment, we could reduce the time by about half with triple the AWS costs. We could also reduce the time by about 60% compared to the original with almost similar costs. 
For 1024-batch inference, the parallel processing effect is even more apparent. We could achieve about 4. 5 times speed improvement with approximately 7. 6 times the cost. Conclusion In this post, we covered methods to accelerate batch inference of architectural design AI. When the agent and environment are bound to a single host, the number of cores available for parallel processing is inevitably limited. To solve this problem, we operated the environment in server form so that the agent and environment could run on independent computing resources. Additionally, since environment server demand is not constant, we utilized cloud services for dynamic computing resources (Cluster Autoscaling) rather than building on-premises resources. We also used Kubernetes’ HPA to automatically adjust the number of environment servers. We plan to use this batch inference acceleration to improve both the training speed of architectural design AI and service performance. At Spacewalk, we actively adopt various technology stacks. We will continue to share various internal case studies through future posts. . .


Improving AVM with Duplicate Data Integration

Improving AVM with Duplicate Data Integration

Topic
When using transaction data for both multi-family housing and officetels, there is a class imbalance: officetel data amounts to only about one-tenth of the multi-family housing data. When the combined data was used without any preprocessing, we found that during training the model fitted the multi-family housing data relatively more closely than the officetel data, resulting in higher predicted prices for officetels.

Therefore, we aim to experiment with methods to reduce the bias caused by data imbalance when developing an integrated model for multi-family housing and officetels.

Method
The first approach one might consider for the data imbalance problem is customizing the loss function. However, it was difficult to modify the loss function arbitrarily in the AutoGluon package we currently use, and such customization has limits as a rather subjective method. Therefore, we tested the hypothesis that "data duplication can be used instead of the loss function to make the loss of specific data points contribute more to the overall loss" - an idea that meets our criteria of being both independent of the AutoGluon package and more systematic.

Result & Analysis
Changes by Area and Type
Plotting the changes by area and type, we can confirm that the integrated model trained with duplicate data generally produces lower results than the existing integrated model. However:
- While the officetel results of the two models are roughly linear with respect to each other, multi-family housing shows no particular linear relationship, and there are cases where the differences between the two models are relatively large.
- For officetels, larger units show more extreme differences between the two models (wider vertical and horizontal ranges, and more points deviating from the trend).
Figure Left: Multi-family housing / Right: Officetel, X-axis: Existing integrated model / Y-axis: Integrated model using duplicate data

Change Amount Scatter Plot
We mapped the changes to check whether the amount of change differs by region. For multi-family housing, there are many cases with large differences between the two models in the Gangseo area, while for officetels, many differences were observed in Seodaemun-gu. Through additional experiments and QA with the investment team, we believe it is necessary to identify what the regions with significant differences have in common.
Figure Left: Multi-family housing / Right: Officetel, Color: Amount of change
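As a rough illustration of the duplication idea described in the Method section (a minimal sketch, not our exact pipeline), the snippet below oversamples the minority officetel rows before the combined table is handed to a tabular trainer; the column names, label, and duplication factor are hypothetical.

# Minimal sketch of duplicate-data oversampling (hypothetical column names and factor).
import pandas as pd

def oversample_officetel(df: pd.DataFrame, factor: int = 10) -> pd.DataFrame:
    # Duplicate officetel rows so their total contribution to the training loss
    # roughly matches multi-family housing, instead of re-weighting the loss itself.
    officetel = df[df["property_type"] == "officetel"]
    extra = pd.concat([officetel] * (factor - 1), ignore_index=True)
    return pd.concat([df, extra], ignore_index=True).sample(frac=1.0, random_state=0)

# Usage (hypothetical): balance the combined table before fitting an AutoGluon model.
# train_df = oversample_officetel(combined_df, factor=10)
# TabularPredictor(label="price").fit(train_df)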


Understanding Form Through Algorithms

Understanding Form Through Algorithms

Introduction
Contemporary science often reveals that the complex patterns we observe in nature actually arise from simple rules repeating and interacting. This phenomenon is called emergence: each individual element behaves in a seemingly straightforward way, but together they form unpredictable and elaborate patterns or structures. Computer algorithms are exceptionally useful for exploring this phenomenon.

By coding and simulating these rules, we can observe—step by step—how forms emerge, rather than just imitating nature or reproducing its results. In other words, algorithms have evolved into a powerful methodology for analyzing the logic behind form generation.

Complexity Science and Algorithmic Models
Computational Models vs. Observed Forms
In complexity science, researchers build simplified, rule-based computational models of natural phenomena, run computer simulations, and compare the simulated outputs to real-world observations. The goal isn't to replicate the physical or chemical processes of nature exactly. Rather, it's to capture the procedure by which forms emerge, and then progressively refine and re-run the experiments. By adjusting parameters and comparing each result to actual data, we gradually uncover how complicated forms arise. Hence, algorithms aren't merely used to imitate nature; they also serve as a key tool for understanding and analyzing shape formation.
Figure: Flocking effect / Boid cohesion

D'Arcy Thompson and Alan Turing
The view that shape generation can be understood through rule-based systems was systematically introduced in D'Arcy Wentworth Thompson's On Growth and Form (1917). While Darwinian evolution dominated thinking at the time, Thompson emphasized the impact of physical forces on biological forms. He showed that structures like jellyfish bells or avian bone shapes could be explained using geometric and physical principles, and that fish body shapes, for instance, could be continuously transformed via grid transformations. In 1952, mathematician Alan Turing proposed the reaction-diffusion model, showing that even extremely simple chemical reactions and diffusion processes could give rise to complex patterns like animal stripes or spots. This research was groundbreaking because it demonstrated that genes don't directly "paint" colors; rather, local reaction rules spontaneously organize the pattern, confirming that emergence can arise from local rule interactions.

Major Algorithms for Explaining Natural Patterns
Scientists have harnessed these concepts to devise numerous algorithmic models that simulate and study form generation in nature.

Fractal Geometry and Self-similarity: Many structures in nature—such as tree branches, lightning, and Romanesco broccoli—exhibit fractal properties. When you zoom into a part of these forms, the smaller portion resembles the shape of the whole, a concept known as self-similarity. Benoît Mandelbrot's fractal geometry shows that applying recursive rules can re-create even the seemingly boundless complexity of coastlines and mountain outlines.
Figure: Fractal patterns

Cellular Automata: Conway's Game of Life is a classic example. Each cell in a grid follows a simple local rule—living or dying based on the state of neighboring cells—yet over time the overall system produces unexpectedly complex patterns. This hints that it's possible to mimic phenomena like animal coat markings or city growth using only simple rules: cellular automata illustrate how individual rules can lead to global complexity.
Figure: Cellular automata / Zebra patterns

Reaction-Diffusion Systems: Alan Turing's theory proposed that patterns like spots or stripes emerge when two or more chemical substances react and diffuse at different rates. One chemical acts as an activator, another as an inhibitor, and the difference in their diffusion speeds lets a pattern spontaneously divide what was initially a uniform space.
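To make this concrete, here is a minimal reaction-diffusion sketch in Python (a Gray-Scott-style system, one of several common formulations; the grid size, diffusion rates, and feed/kill constants below are illustrative, not taken from the original post).

# Minimal Gray-Scott reaction-diffusion sketch (illustrative parameters).
import numpy as np

def laplacian(Z: np.ndarray) -> np.ndarray:
    # Discrete Laplacian with periodic (wrap-around) boundaries.
    return (np.roll(Z, 1, axis=0) + np.roll(Z, -1, axis=0) +
            np.roll(Z, 1, axis=1) + np.roll(Z, -1, axis=1) - 4 * Z)

def gray_scott_step(U, V, Du=0.16, Dv=0.08, feed=0.035, kill=0.065, dt=1.0):
    # One explicit Euler update: the reaction U + 2V -> 3V consumes the substrate U
    # and produces the autocatalytic species V; both species also diffuse at different rates.
    UVV = U * V * V
    U += dt * (Du * laplacian(U) - UVV + feed * (1 - U))
    V += dt * (Dv * laplacian(V) + UVV - (feed + kill) * V)
    return U, V

# Start from a uniform field with one perturbed square; spots or stripes emerge over time.
n = 128
U, V = np.ones((n, n)), np.zeros((n, n))
U[60:68, 60:68], V[60:68, 60:68] = 0.50, 0.25
for _ in range(5000):
    U, V = gray_scott_step(U, V)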
This type of model is widely used in biology, chemistry, and art to explain how intricate patterns form.

Diffusion-Limited Aggregation (DLA): In this model, randomly moving particles stick to a growing cluster, forming intricate branching or fractal-like crystal structures—similar to how lightning, frost, or mineral deposits form. Although each particle's motion is random, the collective behavior results in branching growth.
Figure: Diffusion-limited aggregation

These algorithmic approaches not only replicate the outward appearance of nature, but also serve as a tool for investigating the step-by-step logic by which forms develop.

Algorithmic Explanations of Design Forms
Applications in Design
Algorithms originally used to describe forms in nature are also highly relevant for design. In the past, algorithms might have been viewed mainly as tools for generating "complex and novel shapes," but nowadays they're also employed to analyze and explain why certain forms appear as they do:
- Making generative rules for shape creation explicit gives designs transparency and logical clarity.
- Iterative experimentation shows immediately how altering parameters changes the resulting form.

Richard Serra and Verb Lists
During the late 1960s, sculptor Richard Serra compiled a list of action verbs such as "to roll, to crease, to fold, to cut…" and applied them sequentially in his work. Each action (fold, cut, bend) might seem simple, but cumulatively they produce intricate sculptural results. In algorithmic terms, each verb is a rule that transforms a shape in a specific way. If a computer were to simulate these steps, we could explicitly see that step 1 did X, step 2 did Y, and that this led to the final shape. Hence, Serra's sculptures are not just the product of artistic intuition; they can also be read as the product of a series of transformation rules. An algorithmic model clarifies the logical mechanism behind the art-making process.

Parametric and Procedural Design
In parametric design, you don't fix the final shape. Instead, you define parameters and rules (procedures). Suppose you specify column spacing, curvature, or height as inputs: the algorithm calculates the geometry in real time. Rather than controlling the shape directly, you focus on the rules that generate the shape. If you widen the column spacing by 20%, you can quickly see how it alters the roof curve. Even in complex structures, as long as you know the rules and parameters, repeated testing and optimization become straightforward. Thus, parametric design clarifies the intent behind the form and explains why a given shape resulted, offering a transparent logical structure.
Figure: Examples of action verbs (from the left: to split, to fill)

Combining Creativity with Logic
Using algorithms to explain design doesn't mean discarding artistic intuition. On the contrary, transforming a conceptual idea into explicit rules allows for rapid testing of "What if I tweak this parameter?"—leading to new possibilities. Design knowledge isn't confined to a designer's mind alone; it can become a transparent system that facilitates collaboration and verification.

Conclusion
Algorithmic approaches aren't limited to researching natural phenomena; they're equally powerful in creative fields, such as design, for clarifying how forms emerge and why they appear as they do. In nature, simple local rules can combine to produce emergent complexity. In design, focusing on parameters and rules makes generation processes more transparent and analyzable.
Through algorithms, we don't just examine the final shape—we delve into how it was formed. By integrating artistic intuition with computational logic, we unlock both creative expansion and verifiable structures. Designers can conduct repeated experiments in much the same way that scientists explore optimal solutions. This algorithmic mindset bridges art and science, fostering broader collaboration and discourse, while shedding light on the underlying logic of form.