* math utils in autogen
* cleanup
* code utils
* remove check function from code response
* comment out test
* GPT-4
* increase request timeout
* name
* logging and error handling
* better doc
* doc
* codegen optimized
* GPT series
* text
* no demo example
* math
* import openai
* azure model name
* openai version
* generate assertion if necessary
* condition to generate assertions
* init region key
* rename
* comments about budget
* prompt
---------
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
* add basic support to Spark dataframe
add support to SynapseML LightGBM model
update to pyspark>=3.2.0 to leverage pandas_on_Spark API
* clean code, add TODOs
* add sample_train_data for pyspark.pandas dataframe, fix bugs
* improve some functions, fix bugs
* fix dict change size during iteration
* update model predict
* update LightGBM model, update test
* update SynapseML LightGBM params
* update synapseML and tests
* update TODOs
* Added support to roc_auc for spark models
* Added support to score of spark estimator
* Added test for automl score of spark estimator
* Added cv support to pyspark.pandas dataframe
* Update test, fix bugs
* Added tests
* Updated docs, tests, added a notebook
* Fix bugs in non-spark env
* Fix bugs and improve tests
* Fix uninstall pyspark
* Fix tests error
* Fix java.lang.OutOfMemoryError: Java heap space
* Fix test_performance
* Update test_sparkml to test_0sparkml to use the expected spark conf
* Remove unnecessary widgets in notebook
* Fix iloc java.lang.StackOverflowError
* fix pre-commit
* Added params check for spark dataframes
* Refactor code for train_test_split to a function
* Update train_test_split_pyspark
* Refactor if-else, remove unnecessary code
* Remove y from predict, remove mem control from n_iter compute
* Update workflow
* Improve _split_pyspark
* Fix test failure of too short training time
* Fix typos, improve docstrings
* Fix index errors of pandas_on_spark, add spark loss metric
* Fix typo of ndcgAtK
* Update NDCG metrics and tests
* Remove unneeded logger
* Use cache and count to ensure consistent indexes
* refactor for merge main
* fix errors of refactor
* Updated SparkLightGBMEstimator and cache
* Updated config2params
* Remove unused import
* Fix unknown parameters
* Update default_estimator_list
* Add unit tests for spark metrics
* improve max_valid_n and doc
* Update README.md
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
* add support for chatgpt
* notebook
* newline at end of file
* chatgpt notebook
* ChatGPT in Azure
* doc
* math
* warning, timeout, log file name
* handle import error
* doc update; default value
* paper
* doc
* docstr
* eval_func
* prompt and messages
* remove confusing words
* notebook name
---------
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
* improve max_valid_n and doc
* Update README.md
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
* newline at end of file
* doc
---------
Co-authored-by: Li Jiang <lijiang1@microsoft.com>
Co-authored-by: Susan Xueqing Liu <liususan091219@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
* add cost budget; move loc of make_dir
* support openai completion
* install pytest in workflow
* skip openai test
* test openai
* path for docs rebuild
* install datasets
* signal
* notebook
* notebook in workflow
* optional arguments and special params
* key -> k
* improve readability
* assumption
* optimize for model selection
* larger range of max_tokens
* notebook
* python package workflow
* skip on win
* notebook test
* add ipykernel, remove except
* only create dir if not empty
* Stop sequential tuning when result is None
* fix reproducibility of global search
* save gs seed
* use get to avoid KeyError
* test
* make performance test reproducible
* fix test error
* Doc update and disable logging
* document random_state and version
* remove hardcoded budget
* fix test error and dependency; close #777
* iloc
* install editable package in codespace
* fix test error in test_forecast
* fix test error in test_space
* openml version
* break tests; pre-commit
* skip on py10+win32
* install mlflow in test
* install mlflow in [test]
* skip test in windows
* import
* handle PermissionError
* skip test in windows
* remove ts_forecast_panel from doc
* add vw version requirement
* vw version
* version range
* add documentation
* vw version range
* skip test on py3.10
* vw version
* rephrase
* don't install vw on py 3.10
* move import location
* remove inherit
* 3.10 in version
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* time series forecasting with panel datasets
- integrate Temporal Fusion Transformer as a learner based on pytorchforecasting
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update setup.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update test_forecast.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update setup.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update setup.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update model.py and test_forecast.py
- remove blank lines
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update model.py to prevent errors
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update automl.py and data.py
- change forecast task name
- update documentation for fit() method
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update test_forecast.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update test_forecast.py
- add performance test
- use 'fit_kwargs_by_estimator'
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* add time index function
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update test_forecast.py performance test
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update data.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update automl.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update data.py to prevent type error
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update setup.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update for pytorch forecasting tft on panel datasets
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update automl.py documentations
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* - rename estimator
- add 'gpu_per_trial' for tft estimator
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update test_forecast.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* include ts panel forecasting as an example
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update model.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update documentations
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update automl_time_series_forecast.ipynb
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update documentations
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* "weights_summary" argument deprecated and removed for pl.Trainer()
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update model.py tft estimator prediction method
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update model.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update `fit_kwargs` documentation
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* update automl.py
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* support latest xgboost version
* Update test_classification.py
* Update
There are problems installing xgb 1.6.1 on py3.6
* cleanup
* xgboost version
* remove time_budget_s in test
* remove redundancy
* stop support of python 3.6
Co-authored-by: zsk <shaokunzhang529@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Issue I encountered:
#542: running test_restore.py raises _pickle.UnpicklingError: state is not a dictionary
What I observed:
1. numpy version
   i. With numpy==1.16*, np.random.RandomState.__getstate__() returns a tuple, not a dict,
      and _pickle.UnpicklingError occurs.
   ii. With numpy>1.17.0rc1, it returns a dict,
      and _pickle.UnpicklingError does not occur.
   iii. With numpy>1.17.0rc1, flaml uses np_random_generator = np.random.Generator,
      and _pickle.UnpicklingError does not occur.
2. class _BackwardsCompatibleNumpyRng
   When I remove _BackwardsCompatibleNumpyRng.__getattr__(), _pickle.UnpicklingError does not occur (with either numpy 1.16* or 1.17*).
To sum up:
I think modifying class _BackwardsCompatibleNumpyRng is not a good choice, because it comes from ray, and we still need to learn more about how pickle works.
So I raised the numpy version that flaml requires:
setup.py: "NumPy>=1.17.0rc1"
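
The error message above can be reproduced without numpy or ray. The sketch below is only an illustration of the pickle behaviour, not the code from flaml or ray: `TupleState`, `DictState` and `round_trips` are hypothetical names, and it assumes CPython's C-accelerated pickle module. When a class provides no `__setstate__`, the unpickler falls back to updating the instance `__dict__`, which requires the saved state to be a dict.

```python
import pickle


class TupleState:
    """Hypothetical stand-in for an object whose __getstate__ returns a tuple."""

    def __init__(self, seed):
        self.seed = seed

    def __getstate__(self):
        # A tuple state, loosely mimicking the numpy 1.16 behaviour noted above.
        return ("MT19937", self.seed, 0)


class DictState:
    """Hypothetical stand-in for an object whose __getstate__ returns a dict."""

    def __init__(self, seed):
        self.seed = seed

    def __getstate__(self):
        return {"seed": self.seed}


def round_trips(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(obj))
    except pickle.UnpicklingError:
        # With no __setstate__, the default restore step rejects non-dict state.
        return False
    return True


if __name__ == "__main__":
    print("dict state round-trips:", round_trips(DictState(1)))
    print("tuple state round-trips:", round_trips(TupleState(1)))
```

Defining a matching `__setstate__` on the class would be another way to avoid the error, but that means changing the class itself, which is what the note above chose not to do.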
* add sklearn regressors as learners for ts_forecast task
* add direct forecasting strategy
warnings and errors for duplicate rows and missing values
- add preprocess for sklearn time series forecast
update automl.py
update test/test_forecast.py
* update model.py and test_forecast.py for cv eval_method
* add "hcrystalball" dependency in setup.py
* update automl.py
- add _validate_ts_data function for abstraction
- include xgb_limitdepth as a learner
* update model.py
- update search space for sklearn ts regressors
* update automl.py and test_forecast.py for numpy array inputs
* add documentations to model.py
* add documentation for removing catboost regressor
* update automl.py
- _validate_ts_data() function
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* limit time and memory
* separate tests
* lrl1 can't be limited by limit_resource
* free memory when possible
* passthrough=False when ensemble fails;
retrain when trained_estimator is None
* use callback for resource limit
* handle lower version of xgb with no callback
* free mem ratio
* reduce verbosity
* retrain_final when max_iter==1
* remove trained_estimator from result
* model_history
* wheel
* retrain time as best_config_train_time
* ci: libomp version for xgboost on macos
* limit_resource not working in windows
* test pickle load
* mute forecaster
* notebook update
* check hard
* preventive callback
* add use_ray
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables
- update data.py to transform time series data
- update search space
- update documentations to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task
* update automl.py and test_forecast.py
* update forecast notebook
* update README.md and setup.py
* update ml.py and test_forecast.py
- make "ds" and "y" constant variables
* replace constants with constant variables
* bump version to 0.7.0
* update setup.py
- support 'forecast' and 'ts_forecast'
* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
* warning -> info for low cost partial config
#195, #110
* when n_estimators < 0, use trained_estimator's
* log debug info
* test random seed
* remove "objective"; avoid ZeroDivisionError
* hp config to estimator params
* check type of searcher
* default n_jobs
* try import
* Update searchalgo_auto.py
* CLASSIFICATION
* auto_augment flag
* min_sample_size
* make catboost optional
* config in result
* value can be float
* pytorch notebook example
* docker, pre-commit
* max_failure (#192); early_stop
* extend starting_points (#196)
Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
* remove catboost training dir
* close #48
* bs for hierarchical space. close #85
* retrain for hierarchical space
* clean ml (#180)
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
* support ranking task
* examples
* cv shuffle
* forecast api and implementation cleaner
* period constraints
* delete groups after fit
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']
* update setup.py
* add TimeSeriesSplit to 'regression' and 'classification' task
* add 'time' split_type for 'classification' and 'regression' task
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* feature importance
* variable name
* Update test/test_split.py
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Update test/test_forecast.py
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* prophet installation fail in windows
* upload flaml_forecast.ipynb
Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
* subspace in flow2
* search space and trainable from AutoML
* experimental features: multivariate TPE, grouping, add_evaluated_points
* test experimental features
* readme
* define by run
* set time_budget_s for bs
Co-authored-by: liususan091219 <Xqq630517>
* version
* acl
* test define_by_run_func
* size
* constraints
Co-authored-by: Chi Wang <wang.chi@microsoft.com>