
Commit 82f0a43

autogen subpackage (#968)
* math utils in autogen
* cleanup
* code utils
* remove check function from code response
* comment out test
* GPT-4
* increase request timeout
* name
* logging and error handling
* better doc
* doc
* codegen optimized
* GPT series
* text
* no demo example
* math
* import openai
* import openai
* azure model name
* azure model name
* openai version
* generate assertion if necessary
* condition to generate assertions
* init region key
* rename
* comments about budget
* prompt

---------

Co-authored-by: Susan Xueqing Liu <[email protected]>
1 parent 7f9402b · commit 82f0a43

20 files changed: +5249 −3636 lines

README.md

Lines changed: 18 additions & 2 deletions
@@ -23,9 +23,9 @@
 ## What is FLAML
 FLAML is a lightweight Python library that finds accurate machine
 learning models automatically, efficiently and economically. It frees users from selecting
-models and hyperparameters for each model. It can also be used to tune generic hyperparameters for large language models (LLM), MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
+models and hyperparameters for each model. It can also be used to tune generic hyperparameters for foundation models, MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations and so on.
 
-1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including large language models such as the OpenAI GPT-3 models.
+1. For common machine learning or AI tasks like classification, regression, and generation, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks, including foundation models such as the GPT series.
 1. It is easy to customize or extend. Users can find their desired customizability from a smooth range: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), or full customization (arbitrary training and evaluation code).
 1. It supports fast automatic tuning, capable of handling complex constraints/guidance/early stopping. FLAML is powered by a new, [cost-effective
 hyperparameter optimization](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function/#hyperparameter-optimization-algorithm)
@@ -95,6 +95,22 @@ estimator = LGBMRegressor()
 estimator.fit(X_train, y_train)
 ```
 
+* (New) You can optimize [generations](https://microsoft.github.io/FLAML/docs/Use-Cases/Auto-Generation) by ChatGPT or GPT-4 etc. with your own tuning data, success metrics and budgets.
+
+```python
+from flaml import oai
+
+config, analysis = oai.Completion.tune(
+    data=tune_data,
+    metric="success",
+    mode="max",
+    eval_func=eval_func,
+    inference_budget=0.05,
+    optimization_budget=3,
+    num_samples=-1,
+)
+```
+
 ## Documentation
 
 You can find a detailed documentation about FLAML [here](https://microsoft.github.io/FLAML/) where you can find the API documentation, use cases and examples.
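
The tuned `config` returned above can then be applied at inference time. Below is a minimal sketch of that step (illustrative, not part of this diff; it assumes `test_instance` is a dict with the same fields that the prompt template in the tuning data references, following the positional-context convention used in `flaml/autogen/code_utils.py` later in this commit):

```python
# Apply the tuned config to a new instance. The first positional argument is the
# context dict used to fill in the prompt template; `test_instance` is a
# hypothetical example shaped like the entries of `tune_data`.
response = oai.Completion.create(test_instance, **config)
print(oai.Completion.extract_text(response)[0])
```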

flaml/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 from flaml.automl import AutoML, logger_formatter
 from flaml.tune.searcher import CFO, BlendSearch, FLOW2, BlendSearchTuner, RandomSearch
 from flaml.onlineml.autovw import AutoVW
-from flaml.integrations import oai
+from flaml.autogen import oai
 from flaml.version import __version__
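
Because `flaml/__init__.py` re-exports the relocated module, existing imports keep working after this commit; both of the following resolve to the same module:

```python
from flaml import oai           # re-export, unchanged for callers
from flaml.autogen import oai   # new home of the former flaml.integrations.oai
```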

flaml/autogen/code_utils.py

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
+import signal
+import subprocess
+import sys
+from typing import List, Dict, Tuple, Optional, Union, Callable
+from flaml import oai
+
+
+def timeout_handler(signum, frame):
+    raise TimeoutError("Timed out!")
+
+
+def execute_code(code: str, max_exec_time: Optional[int] = 3):
+    signal.signal(signal.SIGALRM, timeout_handler)
+    code = code.strip()
+    with open("codetest.py", "w") as fout:
+        fout.write(code)
+    try:
+        signal.alarm(max_exec_time)
+        result = subprocess.run(
+            [sys.executable, "codetest.py"],
+            stdout=subprocess.DEVNULL,
+            stderr=subprocess.PIPE,
+        )
+        signal.alarm(0)
+    except TimeoutError:
+        return 0
+    return int(result.returncode == 0)
+
+
+def generate_assertions(
+    definition: str, model: Optional[str] = "gpt-3.5-turbo"
+) -> Tuple[str, float]:
+    """Generate assertions for a function.
+
+    Args:
+        definition (str): The function definition, including the signature and docstr.
+        model (str): The model used for generation.
+
+    Returns:
+        str: The generated assertions.
+        float: The cost of the generation.
+    """
+    prompt = """Given the signature and docstring, write the exactly same number of assertion(s) for the provided example(s) in the docstring, without assertion messages.
+
+func signature:
+{definition}
+assertions:"""
+    response = oai.Completion.create(
+        {"definition": definition},
+        model=model,
+        prompt=prompt,
+        max_tokens=256,
+        stop="\n\n",
+    )
+    cost = oai.Completion.cost(model, response)
+    assertions = oai.Completion.extract_text(response)[0]
+    return assertions, cost
+
+
+def _remove_check(response):
+    """Remove the check function from the response."""
+    # find the position of the check function
+    pos = response.find("def check(")
+    if pos == -1:
+        return response
+    return response[:pos]
+
+
+def eval_function_completions(
+    responses: List[str],
+    definition: str,
+    test: Optional[str] = None,
+    entry_point: Optional[str] = None,
+    assertions: Optional[Union[str, Callable[[str], Tuple[str, float]]]] = None,
+) -> Dict:
+    """Select a response from a list of responses for the function completion task (using generated assertions), and/or evaluate if the task is successful using a gold test.
+
+    Args:
+        responses (list): The list of responses.
+        definition (str): The input definition.
+        test (Optional, str): The test code.
+        entry_point (Optional, str): The name of the function.
+        assertions (Optional, str or Callable): The assertion code which serves as a filter of the responses, or an assertion generator.
+            When provided, only the responses that pass the assertions will be considered for the actual test (if provided).
+
+    Returns:
+        dict: The success metrics.
+    """
+    n = len(responses)
+    if assertions is None:
+        # no assertion filter
+        success_list = []
+        for i in range(n):
+            response = _remove_check(responses[i])
+            code = (
+                f"{response}\n{test}\ncheck({entry_point})"
+                if response.startswith("def")
+                else f"{definition}{response}\n{test}\ncheck({entry_point})"
+            )
+            success = execute_code(code)
+            success_list.append(success)
+        return {
+            "expected_success": 1 - pow(1 - sum(success_list) / n, n),
+            "success": any(s for s in success_list),
+        }
+    if callable(assertions) and n > 1:
+        # assertion generator
+        assertions, gen_cost = assertions(definition)
+    else:
+        gen_cost = 0
+    if n > 1 or test is None:
+        for i in range(n):
+            response = responses[i] = _remove_check(responses[i])
+            code = (
+                f"{response}\n{assertions}"
+                if response.startswith("def")
+                else f"{definition}{response}\n{assertions}"
+            )
+            succeed_assertions = execute_code(code)
+            if succeed_assertions:
+                break
+    else:
+        # just test, no need to check assertions
+        succeed_assertions = False
+        i, response = 0, responses[0]
+    if test is None:
+        # no test code
+        return {
+            "index_selected": i,
+            "succeed_assertions": succeed_assertions,
+            "gen_cost": gen_cost,
+            "assertions": assertions,
+        }
+    code_test = (
+        f"{response}\n{test}\ncheck({entry_point})"
+        if response.startswith("def")
+        else f"{definition}{response}\n{test}\ncheck({entry_point})"
+    )
+    success = execute_code(code_test)
+    return {
+        "index_selected": i,
+        "succeed_assertions": succeed_assertions,
+        "success": success,
+        "gen_cost": gen_cost,
+        "assertions": assertions,
+    }
+
+
+def implement(
+    definition: str,
+    configs: List[Dict],
+    assertions: Optional[
+        Union[str, Callable[[str], Tuple[str, float]]]
+    ] = generate_assertions,
+) -> Tuple[str, float, int]:
+    """Implement a function from a definition.
+
+    Args:
+        definition (str): The function definition, including the signature and docstr.
+        configs (list): The list of configurations for completion.
+        assertions (Optional, str or Callable): The assertion code which serves as a filter of the responses, or an assertion generator.
+
+    Returns:
+        str: The implementation.
+        float: The cost of the implementation.
+        int: The index of the configuration which generates the implementation.
+    """
+    cost = 0
+    if len(configs) > 1 and callable(assertions):
+        assertions, cost = assertions(definition)
+    for i, config in enumerate(configs):
+        response = oai.Completion.create({"definition": definition}, **config)
+        cost += oai.Completion.cost(config["model"], response)
+        responses = oai.Completion.extract_text(response)
+        metrics = eval_function_completions(
+            responses, definition, assertions=assertions
+        )
+        assertions = metrics["assertions"]
+        cost += metrics["gen_cost"]
+        if metrics["succeed_assertions"] or i == len(configs) - 1:
+            return responses[metrics["index_selected"]], cost, i
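
For orientation, here is a usage sketch of the new utilities (not part of the commit; the task definition, prompt template, and model choices are illustrative assumptions, and running it requires OpenAI credentials):

```python
from flaml.autogen.code_utils import implement

# A hypothetical HumanEval-style task: a signature plus a docstring whose
# example generate_assertions can turn into executable assert statements.
definition = '''def add(a: int, b: int) -> int:
    """Return the sum of a and b.

    >>> add(1, 2)
    3
    """
'''

# Each config is forwarded to oai.Completion.create with {"definition": definition}
# as the context, so the prompt template may reference {definition}. Ordering the
# configs from cheap to strong pairs with implement's early return: it stops at
# the first config whose responses pass the generated assertions.
configs = [
    {"model": "gpt-3.5-turbo", "prompt": "{definition}", "n": 3},
    {"model": "gpt-4", "prompt": "{definition}", "n": 1},
]

implementation, cost, config_index = implement(definition, configs)
print(f"config {config_index} selected; total cost ${cost:.4f}")
print(implementation)
```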
