
SelfGoal: Your Language Agent Already Knows How to Achieve High-Level Goals

Language agents powered by large language models (LLMs) are becoming increasingly valuable as decision-making tools in fields such as gaming and programming. However, these agents often struggle to achieve high-level goals without detailed specifications and to adapt to environments with delayed feedback. In this paper, we introduce SelfGoal, a novel automatic method designed to enhance an agent's ability to achieve high-level goals under limited human priors and environmental feedback. The core of SelfGoal is an adaptive tree structure that, during interaction with the environment, decomposes a high-level goal into progressively more practical sub-goals, while identifying the most useful sub-goals and updating the tree accordingly. Experimental results show that SelfGoal significantly improves the performance of language agents across a variety of tasks, including competitive and cooperative settings as well as environments with delayed feedback.

Language agents powered by large language models (LLMs) are rapidly emerging as valuable decision-making tools in various fields, including gaming and programming.

In gaming, language agents can help players analyze complex game situations and offer strategic advice. In strategy games, for example, they can propose sensible development paths and combat strategies based on the current map layout, resource distribution, and the positions of the various factions. In programming, language agents can help programmers understand complex code logic, suggest code optimizations, and even automatically generate basic code scaffolding from requirements.

In practice, however, these agents do not always operate smoothly, and they face a series of challenges. A key one is that they must achieve high-level goals without detailed specifications. In a complex game, for instance, the agent may be given only a high-level goal such as "win the game," with no concrete steps or methods. The agent therefore needs strong analytical and planning capabilities to derive feasible action paths from such a broad objective.

Additionally, these agents must adapt to environments with delayed feedback. In many practical scenarios, an agent's actions do not receive immediate feedback from the environment. In an online multiplayer game, for example, factors such as network latency mean that an action's impact on the game state may take time to appear. Similarly, in programming, the results of code execution may not be immediately apparent: when processing large-scale data or running complex algorithms, it can take a long time to determine whether the code executed correctly and achieved the expected effect.

In this paper, we focus on SelfGoal, an automatic method specifically designed to address the problems these language agents face, with the core aim of significantly enhancing an agent's ability to achieve high-level goals under limited human priors and environmental feedback.

The core of SelfGoal is a tree structure. During interaction with the environment, the agent adaptively decomposes a high-level goal into a series of more practical sub-goals. This decomposition is not a one-time step; it is continually adjusted and refined as interaction proceeds. Given the high-level goal of "winning the game," for example, the agent might first derive sub-goals such as "acquire resources," "enhance self-strength," and "understand the opponent's situation." As the game unfolds, it then refines these further based on the actual situation, for instance splitting "acquire resources" into more specific sub-goals like "collect specific types of resources" and "acquire resources through trade."
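The decomposition described above can be sketched as a simple goal tree. This is a minimal illustration, not the actual implementation: the `GoalNode` class and the hard-coded sub-goal lists are assumptions, whereas in SelfGoal the sub-goal texts would be proposed by an LLM during interaction with the environment.

```python
from dataclasses import dataclass, field

@dataclass
class GoalNode:
    """A node in the goal tree: a goal description plus its sub-goals."""
    description: str
    children: list["GoalNode"] = field(default_factory=list)

    def decompose(self, sub_descriptions: list[str]) -> list["GoalNode"]:
        """Attach new sub-goals beneath this node (here from a fixed
        list; in SelfGoal they would come from an LLM prompt)."""
        new_nodes = [GoalNode(d) for d in sub_descriptions]
        self.children.extend(new_nodes)
        return new_nodes

    def leaves(self) -> list["GoalNode"]:
        """Return all leaf sub-goals, the current candidates for action."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

# Example: the "winning the game" decomposition described above.
root = GoalNode("win the game")
root.decompose(["acquire resources", "enhance self-strength",
                "understand the opponent's situation"])
root.children[0].decompose(["collect specific types of resources",
                            "acquire resources through trade"])
print([n.description for n in root.leaves()])
```

Because "acquire resources" has been refined into two finer sub-goals, the leaf set contains those two plus the two unrefined sub-goals.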

At the same time, SelfGoal identifies the most useful sub-goals during decomposition. It evaluates each sub-goal against current environmental information, its own status, and past experience to determine which sub-goals are most critical for achieving the high-level goal. For instance, if at some stage of the game the agent finds itself relatively weak while the opponent is expanding rapidly, it may judge "enhancing self-strength" to be the most critical sub-goal at that moment and concentrate its resources and efforts on it.
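The sub-goal evaluation step can be illustrated as a small ranking sketch. The numeric scores and the `score_fn` callable below are hypothetical stand-ins for the LLM-based judgment SelfGoal would apply to each sub-goal given the current state.

```python
def select_subgoals(candidates, score_fn, k=2):
    """Rank candidate sub-goals by a usefulness score and keep the
    top k. score_fn is a stand-in for an LLM judging each sub-goal
    against the current environment, agent status, and experience."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:k]

# Hypothetical scores reflecting the scenario above: the agent is weak
# and the opponent is expanding, so strengthening itself matters most.
scores = {
    "acquire resources": 0.6,
    "enhance self-strength": 0.9,
    "understand the opponent's situation": 0.4,
}
top = select_subgoals(list(scores), scores.get, k=2)
print(top)  # → ['enhance self-strength', 'acquire resources']
```

Keeping only the top-k sub-goals keeps the agent's prompt focused on the guidance that matters most at the current stage.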

Moreover, SelfGoal progressively updates this tree structure. As the environment changes and the agent's understanding of it improves, SelfGoal continually adjusts both the relationships between sub-goals and the content of each sub-goal. When a new method of acquiring resources is discovered, or a previously unimportant sub-goal becomes crucial, the tree is promptly updated so that the overall goal plan always tracks environmental changes and actual needs.
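One update pass over the tree might look like the following sketch, where each node is a plain dictionary and the `keep` and `expand` callables are hypothetical stand-ins for the LLM judgments that decide whether a sub-goal is still useful and how to refine a leaf.

```python
def update_tree(node, keep, expand):
    """One update pass: prune sub-goals that are no longer useful,
    recurse into the survivors, and expand any node that is (or has
    become) a leaf with freshly proposed sub-goals."""
    node["children"] = [c for c in node["children"] if keep(c["goal"])]
    for child in node["children"]:
        update_tree(child, keep, expand)
    if not node["children"]:
        node["children"] = [{"goal": g, "children": []}
                            for g in expand(node["goal"])]

# Illustrative scenario: "mine ore" has stopped being useful, and a
# new way of acquiring resources has been discovered.
tree = {"goal": "acquire resources",
        "children": [{"goal": "mine ore", "children": []}]}
update_tree(tree,
            keep=lambda g: g != "mine ore",
            expand=lambda g: ["acquire resources through trade"])
print([c["goal"] for c in tree["children"]])  # → ['acquire resources through trade']
```

Pruning before expanding means an obsolete branch is replaced in a single pass, which is the behavior the paragraph above describes when a previously planned sub-goal loses its relevance.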

Our experiments yield encouraging results: SelfGoal significantly improves the performance of language agents across a variety of tasks. In competitive settings such as multiplayer games, agents formulate better strategies and win more often. In cooperative scenarios such as team programming projects, SelfGoal helps coordinate the participants, clarify their respective sub-goals, and improve completion efficiency. Even in environments with delayed feedback, reasonable goal decomposition and dynamic adjustment allow agents to cope with uncertainty and achieve their high-level goals.