656 Commits

Author SHA1 Message Date
Yiran Wu
e3ca95bf8a
An agent implementation of MathChat (#1090)
* mathchat implementation

* code format

* update readme

* update openai.yml

* update openai.yml

* update openai.yml
2023-06-25 13:49:34 +00:00
EgorKraevTransferwise
5245efbd2c
Factor out time series-related functionality into a time series Task object (#989)
* Refactor into automl subpackage

Moved some of the packages into an automl subpackage to tidy before the
task-based refactor. This is in response to discussions with the group
and a comment on the first task-based PR.

Only changes here are moving subpackages and modules into the new
automl, fixing imports to work with this structure and fixing some
dependencies in setup.py.

* Fix doc building post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Fix broken links in website post automl subpackage refactor

* Remove vw from test deps as this is breaking the build

* Move default back to the top-level

I'd moved this to automl as that's where it's used internally, but had
missed that this is actually part of the public interface so makes sense
to live where it was.

* Re-add top level modules with deprecation warnings

flaml.data, flaml.ml and flaml.model are re-added to the top level,
being re-exported from flaml.automl for backwards compatibility. Adding
a deprecation warning so that we can have a planned removal later.

* Fix model.py line-endings

* WIP

* WIP - Notes below

Got to the point where the methods from AutoML are pulled to
GenericTask. Started removing private markers and removing the passing
of automl to these methods. Done with decide_split_type, started on
prepare_data. Need to do the others after

* Re-add generic_task

* Most of the merge done, test_forecast_automl fit succeeds, fails at predict()

* Remaining fixes - test_forecast.py passes

* Comment out holidays-related code as it's not currently used

* Further holidays cleanup

* Fix imports in a test

* tidy up validate_data in time series task

* Test fixes

* Fix tests: add Task.__str__

* Fix tests: test for ray.ObjectRef

* Hotwire TS_Sklearn wrapper to fix test fail

* Attempt at test fix

* Fix test where val_pred_y is a list

* Attempt to fix remaining tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Push to retrigger tests

* Remove plots from automl/test_forecast

* Remove unused data size field from Task

* Fix import for CLASSIFICATION in notebook

* Monkey patch TFT to avoid plotting, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Monkey patch TFT to avoid plotting v2, to fix tests on MacOS

* Fix circular import

* remove redundant code in task.py post-merge

* Fix test: set svd_solver="full" in PCA

* Update flaml/automl/data.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Fix review comments

* Fix task -> str in custom learner constructor

* Remove unused CLASSIFICATION imports

* Hotwire TS_Sklearn wrapper to fix test fail by setting
optimizer_for_horizon == False

* Revert changes to the automl_classification and pin FLAML version

* Fix imports in reverted notebook

* Fix FLAML version in automl notebooks

* Fix ml.py line endings

* Fix CLASSIFICATION task import in automl_classification notebook

* Uncomment pip install in notebook and revert import

Not convinced this will work because of installing an older version of
the package into the environment in which we're running the tests, but
let's see.

* Revert c6a5dd1a0

* Fix get_classification_objective import in suggest.py

* Remove hcrystallball docs reference in TS_Sklearn

* Merge markharley:extract-task-class-from-automl into this

* Fix import, remove smooth.py

* Fix dependencies to fix TFT fail on Windows Python 3.8 and 3.9

* Add tensorboardX dependency to fix TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 and 3.9

* Set pytorch-lightning==1.9.0 to fix TFT fail on Windows Python 3.8 and 3.9

* Disable PCA reduction of lagged features for now, to fix svd convergence fail

* Merge flaml/main into time_series_task

* Attempt to fix formatting

* Attempt to fix formatting

* tentatively implement holt-winters-no covariates

* fix forecast method, clean class

* checking external regressors too

* update test forecast

* remove duplicated test file, re-add sarimax, search space cleanup

* Update flaml/automl/model.py

removed links. Most important one probably was: https://robjhyndman.com/hyndsight/ets-regressors/

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prevent short series

* add docs

* First attempt at merging Holt-Winters

* Linter fix

* Add holt-winters to TimeSeriesTask.estimators

* Fix spark test fail

* Attempt to fix another spark test fail

* Attempt to fix another spark test fail

* Change Black max line length to 127

* Change Black max line length to 120

* Add logging for ARIMA params, clean up time series models inheritance

* Add more logging for missing ARIMA params

* Remove a meaningless test causing a fail, add stricter check on ARIMA params

* Fix a bug in HoltWinters

* A pointless change to hopefully trigger the on and off KeyError in ARIMA.fit()

* Fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Attempt to fix formatting

* Add type annotations to _train_with_config() in state.py

* Add type annotations to prepare_sample_train_data() in state.py

* Add docstring for time_col argument of AutoML.fit()

* Address @sonichi's comments on PR

* Fix formatting

* Fix formatting

* Reduce test time budget

* Reduce test time budget

* Increase time budget for the test to pass

* Remove redundant imports

* Remove more redundant imports

* Minor fixes of points raised by Qingyun

* Try to fix pandas import fail

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Try to fix pandas import fail, again

* Formatting fixes

* More formatting fixes

* Added test that loops over TS models to ensure coverage

* Fix formatting issues

* Fix more formatting issues

* Fix random fail in check

* Put back in tests for ARIMA predict without fit

* Put back in tests for lgbm

* Update test/test_model.py

cover dedup

* Match target length to X length in missing test

---------

Co-authored-by: Mark Harley <mark.harley@transferwise.com>
Co-authored-by: Mark Harley <mharley.code@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Andrea W <a.ruggerini@ammagamma.com>
Co-authored-by: Andrea Ruggerini <nescio.adv@gmail.com>
Co-authored-by: Egor Kraev <Egor.Kraev@tw.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-06-19 11:20:32 +00:00
Chi Wang
8760631349
string to array (#1086)
* string to array

* exclude aoai
2023-06-17 13:11:22 +00:00
Chi Wang
e1da7f7d68
update openai model support (#1082)
* update openai model support

* new gpt3.5

* docstr

* function_call and content may co-exist

* test function call

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-16 00:58:44 +00:00
Chi Wang
0b739b8c93
Links to papers (#1084)
* Links to papers

* Update website/docs/Use-Cases/Auto-Generation.md

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-15 09:56:24 +00:00
Qingyun Wu
0c7082c7bf
Documentation for agents (#1057)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add agent doc

* del old files

* remove chat

* agent doc

* remove chat_agent

* naming

* improve documentation

* wording

* improve agent doc

* wording

* general auto reply

* update agent doc

* human input mode

* add agent figure

* update agent figure

* update agent example figure

* update code example

* extensibility of UserProxyAgent

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-06-14 23:56:13 +00:00
Li Jiang
3874a429cf
fix workflow (#1071) 2023-06-14 06:44:30 +00:00
Qingyun Wu
9356a92ba5
add pandas requirement in benchmark option (#1070)
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-06-13 08:25:41 +00:00
Chi Wang
c5dfb03f0e
encode timeout msg in bytes (#1078)
* encode timeout msg in bytes

* fix msg and test
2023-06-12 18:07:14 +00:00
Chi Wang
a30d198530
Fix documentation (#1075)
* Fix indentation in documentation

* newline

* version
2023-06-11 01:03:49 +00:00
Chi Wang
5387a0a607
Agent notebook example with human feedback; Support shell command and multiple code blocks; Improve the system message for assistant agent; Improve utility functions for config lists; reuse docker image (#1056)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* agent notebook example with user feedback

* log

* log

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* shell command and multiple code blocks

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* coding agent

* math notebook

* renaming and doc format

* typo

* infer lang

* sh

* docker

* docker

* reset consecutive autoreply counter

* fix explanation

* paper talk

* human feedback

* web info

* rename test

* config list explanation

* link to blogpost

* installation

* homepage features

* features

* features

* rename agent

* remove notebook

* notebook test

* docker command

* notebook update

* lang -> cmd

* notebook

* make it work for gpt-3.5

* return full log

* quote

* docker

* docker

* docker

* docker

* docker

* docker image list

* notebook

* notebook

* use_docker

* use_docker

* use_docker

* doc

* agent

* doc

* abs path

* pandas

* docker

* reuse docker image

* context window

* news

* print format

* pyspark version in py3.8

* pyspark in py3.8

* pyspark and ray

* quote

* pyspark

* pyspark

* pyspark

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-06-09 18:40:04 +00:00
Li Jiang
d36b2afe7f
suppress warning message of pandas_on_spark to_spark (#1058) 2023-06-01 16:04:01 +00:00
Li Jiang
b975df81da
Support more azure openai api_type (#1059) 2023-05-31 12:07:57 +00:00
Qingyun Wu
fed28e700b
add agent notebook and documentation (#1052)
* add agent notebook and documentation

* fix bug

* set flush to True when printing msg in agent

* add a math problem in agent notebook

* remove

* header

* improve notebook doc

* notebook update

* improve notebook example

* improve doc

* improve notebook doc

* improve print

* doc

* human_input_mode

* human_input_mode str

* indent

* indent

* Update flaml/autogen/agent/user_proxy_agent.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update notebook/autogen_agent.ipynb

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* renaming and doc format

* typo

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-28 03:17:23 +00:00
Qingyun Wu
3e6e834bbb
remove redundant doc and add tutorial (#1004)
* remove redundant doc and add tutorial

* add demos for pydata2023

* Update pydata23 docs

* remove redundant notebooks

* Move tutorial notebooks to notebook folder

* update readme and notebook links

* update notebook links

* update links

* update readme

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-27 02:59:51 +00:00
Chi Wang
b90e9ee283
doc and test update (#1053)
* doc and test update

* docker update
2023-05-26 20:24:30 +00:00
badjouras
9e2eb200df
docs: 📝 Fix link to installation section in Task-Oriented-AutoML.md (#1051)
Co-authored-by: rcardoso <ricardo.cardoso@anova.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-25 02:23:54 +00:00
Chi Wang
a0b318b12e
create an automl option to remove unnecessary dependency for autogen and tune (#1007)
* version update post release v1.2.2

* automl option

* import pandas

* remove automl.utils

* default

* test

* type hint and version update

* dependency update

* link to open in colab

* use packaging.version to close #725

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-24 23:55:04 +00:00
Chi Wang
e9fdbc6e02
Improve messaging in documentation (#1050)
* Improve messaging in documentation

* doc

* improve wording in blogpost

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-05-24 21:28:52 +00:00
Chi Wang
9977a7aae1
Blogpost for adaptation in HumanEval (#1048)
* Blogpost for adaptation in HumanEval

* doc

* fix link

* fix link

* explain

* model

* interface

* link

* typo

* doc
2023-05-23 04:22:15 +00:00
Chi Wang
e463146cb8
response filter (#1039)
* response filter

* rewrite implement based on the filter

* multi responses

* abs path

* code handling

* option to not use docker

* context

* eval_only -> raise_error

* notebook

* utils

* utils

* separate tests

* test

* test

* test

* test

* test

* test

* test

* test

* **config in test()

* test

* test

* filename
2023-05-21 22:22:29 +00:00
Li Jiang
7de4eb347d
Fix PULL_REQUEST_TEMPLATE and improve test by removing unnecessary environment variable (#1043)
* Improve test by removing unnecessary environment variable

* Fix PULL_REQUEST_TEMPLATE

* Hide pre-commit check

* remove the checkbox for pre-commit

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-19 20:05:14 +00:00
Pratyay Roy
683f6befd2
updated search space (#1044)
Co-authored-by: Pratyay Roy <63900765+pratyay-roy@users.noreply.github.com>
2023-05-17 22:36:41 +00:00
Qingyun Wu
a1f51d1d23
Blogpost (#1026)
* add 1m milestone blogpost

* format issues

* update subsection title

* acknowledgement

* Update website/blog/2023-05-07-1M-milestone/index.mdx

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/blog/2023-05-07-1M-milestone/index.mdx

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* update blogpost

* collaborators

* wording

* Azure Data to Azure Synapse

* name

* Azure Synapse Analytics

* tasks and search space

* Update website/blog/2023-05-07-1M-milestone/index.mdx

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-17 03:49:19 +00:00
Chi Wang
0e2dbd5378
fix of website link (#1042) 2023-05-16 06:18:33 +00:00
Qingyun Wu
2e43509690
Human agent (#1025)
* add human agent and chat agent

* feedback msg

* clean print

* remove redundant import

* make coding agent work

* import check

* terminate condition

* rename

* add docstr

* exitcode to str

* print

* save and execute code

* add max_turn_num

* add max_turn_num in test_agent.py

* reduce max_turn_num in the test

* change max_turn_num to max_consecutive_auto_reply

* update human proxy agent

* remove execution agent and dated docstr

* clean doc

* add back work_dir

* add is_termination_msg when mode is NEVER

* revise stop condition

* remove work_dir in coding agent

* human_proxy_agent docstr

* auto_reply

* clean auto_reply

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-16 00:37:38 +00:00
Susan Xueqing Liu
f01acb67f6
update model of text summarization (#1030) 2023-05-10 00:48:22 +00:00
Chi Wang
59e882e5cc
chat completion check (#1024)
* chat completion check

* add test

* doc

* timeout

* bump version to 1.2.4
2023-05-09 20:39:46 +00:00
Beibin Li
51c8768bcf
Catch AuthenticationError trying different configs (#1023)
* Catch AuthenticationError trying different configs
While trying different openai `config_list`, some
configs might be outdated (e.g., an API key is expired).
In these cases, we don't want the program to crash.
Instead, we might want to try other configs.

* Lint whitespace
2023-05-06 11:16:50 +00:00
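The fallback behaviour described in the message for #1023 can be illustrated with a minimal sketch. This is not the code from that commit: `AuthenticationError` here is a locally defined stand-in for the client library's exception, and `call_with_fallback` and `fake_request` are hypothetical names introduced only for illustration. The idea is simply to try each entry of `config_list` in turn and move on when one fails authentication.

```python
class AuthenticationError(Exception):
    """Stand-in for the client library's authentication failure (hypothetical)."""


def call_with_fallback(config_list, request):
    """Try each config in order, skipping those that fail authentication.

    `request` is any callable that takes one config dict and either returns
    a response or raises AuthenticationError (an assumption of this sketch).
    """
    last_error = None
    for config in config_list:
        try:
            return request(config)
        except AuthenticationError as exc:
            # An expired or invalid key should not crash the program;
            # remember the error and move on to the next config.
            last_error = exc
    if last_error is None:
        raise ValueError("config_list is empty")
    # Every config failed authentication: surface the last failure.
    raise last_error


def fake_request(config):
    # Hypothetical request function: only the key "good" authenticates.
    if config["api_key"] != "good":
        raise AuthenticationError(f"rejected key {config['api_key']!r}")
    return f"ok via {config['api_key']}"


result = call_with_fallback(
    [{"api_key": "expired"}, {"api_key": "good"}],
    fake_request,
)
```

Only authentication failures are swallowed in this sketch; any other exception still propagates from the first config that raises it, so unrelated bugs are not hidden by the retry loop.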
Chi Wang
b3fba9734e
Mark experimental classes; doc; multi-config trial (#1021)
* Mark experimental classes

* template

* multi model

* test

* multi-config doc

* doc

* doc

* test

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-05 02:48:31 +00:00
Li Jiang
8b2411b219
update spark session in spark tests (#1006)
* add mlflow and spark integration tests

* remove unused params

* remove mlflow tests
2023-05-03 09:59:29 +00:00
Li Jiang
fd1f36597b
update max_spark_parallelism to fit in auto-scale spark cluster (#1008)
* update max_spark_parallelism to fit in auto-scale spark cluster

* update test
2023-05-03 09:16:32 +00:00
Susan Xueqing Liu
00c30a398e
fix NLP zero division error (#1009)
* fix NLP zero division error

* set predictions to None

* set predictions to None

* set predictions to None

* refactor

* refactor

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-05-03 05:50:28 +00:00
garar
31864d2d77
Add mlflow_logging param (#1015)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-05-03 03:09:04 +00:00
Chi Wang
19aee67f55
coding agent; logging (#1011)
* coding agent

* tsp

* tsp

* aoai

* logging

* compact

* Handle Import Error

* cost function

* reset counter; doc

* reset_counter

* home page update

* use case

* catboost in linux

* catboost

* catboost

* catboost

* doc

* intro

* catboost
2023-05-02 20:38:23 +00:00
Li Jiang
39b9a9a417
Fix catboost failure in mac-os python<3.9 (#1020) 2023-05-02 14:19:56 +00:00
Chi Wang
6d7fb3d786
raise content_filter error (#1018)
* raise content_filter error

* import error handling
2023-04-29 18:46:28 +00:00
Qingyun Wu
06cd3f52e5
update readme (#1014)
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-04-28 06:38:09 +00:00
Jirka Borovec
73bb6e7667
pyproject.toml & switch to Ruff (#976)
* unify config to pyproject.toml
replace flake8 with Ruff

* drop configs

* update

* fixing

* Apply suggestions from code review

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>

* setup

* ci

* pr template

* reword

---------

Co-authored-by: Zvi Baratz <z.baratz@gmail.com>
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-04-28 01:54:55 +00:00
Anupam
a8752b6aa0
fixed sentence misplace #998 (#1010) 2023-04-26 15:07:33 +00:00
Sayan Roy
e9cd6a058c
fixing the typo #990 (#994)
* fixing the typo #990

* Update website/docs/Use-Cases/Auto-Generation.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* removing extra space : Update website/docs/Use-Cases/Auto-Generation.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/docs/Use-Cases/Auto-Generation.md

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update website/docs/Use-Cases/Auto-Generation.md

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-04-26 05:48:09 +00:00
Chi Wang
f097c20f86
version update post release v1.2.2 (#1005) 2023-04-25 04:48:17 +00:00
Chi Wang
fa5ccea862
extract code from text; solve_problem; request_timeout in config; improve code (#999)
* extract code from text

* solve_problem; request_timeout in config

* improve

* move import statement

* improve code

* generate assertions

* constant

* configs for implement; voting

* doc

* execute code in docker

* success indicator of code execution in docker

* success indicator

* execute code

* strip n

* add cost in generate_code

* add docstr

* filename

* bytes

* check docker version

* print log

* python test

* remove api key address

* rename exit code

* success exit code

* datasets

* exit code

* recover openai tests

* cache and pattern match

* wait

* wait

* cache and test

* timeout test

* python image name and skip macos

* windows image

* docker images

* volume path and yaml

* win path -> posix

* extensions

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* path

* skip windows

* path

* timeout in windows

* use_docker

* use_docker

* hot fix from #1000

---------

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-04-23 11:50:29 +00:00
Susan Xueqing Liu
7114b8f742
fix zerodivision (#1000)
* fix zerodivision

* update

* remove final

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-04-23 03:55:51 +00:00
Chi Wang
da0d8c05e1
Blog post for LLM tuning (#986)
* outline

* revision

* eval function signature

* first draft

* link

* format

* example

* cleanup

* average

* move figure

* tldr

* bold

* bold

* tag
2023-04-22 04:41:16 +00:00
Susan Xueqing Liu
99bb0a8425
update nlp notebook (#940)
* update nlp notebook

* rerun

* rerun

* removing redundant in notebook

* remove redundant content in nlp notebook

* update notebook

* update plot

* update plot

* update plot

---------

Co-authored-by: Li Jiang <lijiang1@microsoft.com>
2023-04-17 17:29:36 +00:00
Chi Wang
d4070e24c1
make context optional; improve error handling and doc (#997)
* make context optional

* better error handling and doc

* skip instantiation if no context

* skip test
2023-04-16 21:18:32 +00:00
Jane Illarionova
b235fe0098
Expose feature and label transformer in automl.py (#993)
* expose label and feature transformer

* linter apply

* avoid undefined attribute in flaml/automl/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* avoid undefined attribute in flaml/automl/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* retrigger checks

* retrigger checks

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2023-04-15 19:06:47 +00:00
Li Jiang
c9fc622af1
fix tests failure caused by version incompatibility (#995) 2023-04-15 14:52:40 +00:00
Chi Wang
c780d79004
Post release update (#985)
* news update

* doc update

* avoid KeyError

* bump version to 1.2.1

* handle empty responses

* typo

* eval function
2023-04-10 20:46:28 +00:00