autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-10-31 01:40:58 +00:00

Author	SHA1	Message	Date
Mark Harley	44ddf9e104	Refactor into automl subpackage (#809) * Refactor into automl subpackage Moved some of the packages into an automl subpackage to tidy before the task-based refactor. This is in response to discussions with the group and a comment on the first task-based PR. Only changes here are moving subpackages and modules into the new automl, fixing imports to work with this structure and fixing some dependencies in setup.py. * Fix doc building post automl subpackage refactor * Fix broken links in website post automl subpackage refactor * Fix broken links in website post automl subpackage refactor * Remove vw from test deps as this is breaking the build * Move default back to the top-level I'd moved this to automl as that's where it's used internally, but had missed that this is actually part of the public interface so makes sense to live where it was. * Re-add top level modules with deprecation warnings flaml.data, flaml.ml and flaml.model are re-added to the top level, being re-exported from flaml.automl for backwards compatibility. Adding a deprecation warning so that we can have a planned removal later. * Fix model.py line-endings * Pin pytorch-lightning to less than 1.8.0 We're seeing strange lightning related bugs from pytorch-forecasting since the release of lightning 1.8.0. Going to try constraining this to see if we have a fix. * Fix the lightning version pin Was optimistic with setting it in the 1.7.x range, but that isn't compatible with python 3.6 * Remove lightning version pin * Revert dependency version changes * Minor change to retrigger the build * Fix line endings in ml.py and model.py Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu> Co-authored-by: EgorKraevTransferwise <egor.kraev@transferwise.com>	2022-12-06 15:46:08 -05:00
Chi Wang	92b79221b6	make performance test reproducible (#837) * make performance test reproducible * fix test error * Doc update and disable logging * document random_state and version * remove hardcoded budget * fix test error and dependency; close #777 * iloc	2022-12-06 10:13:39 -08:00
Shreyas	3b3b0bfa8e	roc_auc_weighted metric addition (#827) * Pending changes exported from your codespace * Update flaml/automl.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update flaml/automl.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update flaml/ml.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update flaml/ml.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update website/docs/Examples/Integrate - Scikit-learn Pipeline.md Co-authored-by: Chi Wang <wang.chi@microsoft.com> * added documentation for new metric * Update flaml/ml.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * minor notebook changes * Update Integrate - Scikit-learn Pipeline.md * Update notebook/automl_classification.ipynb Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update integrate_azureml.ipynb Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2022-12-02 19:27:32 -08:00
skzhang1	96f0688595	fix	2022-08-24 02:58:18 +00:00
skzhang1	50fb20ebbc	update	2022-08-21 17:44:26 +00:00
skzhang1	462c27f8ae	fix	2022-08-21 12:54:58 +00:00
skzhang1	e3aa7ea9d1	update	2022-08-21 12:52:27 +00:00
skzhang1	34085b8c25	update	2022-08-15 14:41:30 +00:00
skzhang1	a55ed0ed61	update	2022-08-13 18:56:46 +00:00
skzhang1	fc633ef15e	update	2022-08-13 18:51:33 +00:00
jmrichardson	e43485607a	Disable shuffle for custom CV (#659) * Disable shuffle for custom CV * Add custom fold shuffle test * Update test_split.py * Update test_split.py	2022-08-12 17:05:32 -07:00
Kevin Chen	f718d18b5e	time series forecasting with panel datasets (#541) * time series forecasting with panel datasets - integrate Temporal Fusion Transformer as a learner based on pytorchforecasting Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update setup.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update test_forecast.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update setup.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update setup.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update model.py and test_forecast.py - remove blank lines Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update model.py to prevent errors Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update automl.py and data.py - change forecast task name - update documentation for fit() method Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update test_forecast.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update test_forecast.py - add performance test - use 'fit_kwargs_by_estimator' Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * add time index function Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update test_forecast.py performance test Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update data.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update automl.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update data.py to prevent type error Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update setup.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update for pytorch forecasting tft on panel datasets Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update automl.py documentations Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * - rename estimator - add 'gpu_per_trial' for tft estimator Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update test_forecast.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * include ts panel forecasting as an example Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update model.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update documentations Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update automl_time_series_forecast.ipynb Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update documentations Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * "weights_summary" argument deprecated and removed for pl.Trainer() Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update model.py tft estimator prediction method Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update model.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update `fit_kwargs` documentation Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * update automl.py Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2022-08-12 08:39:22 -07:00
skzhang1	e3c9da50da	update	2022-08-10 00:42:47 +00:00
skzhang1	e5a422c41e	update	2022-08-07 18:11:04 +00:00
Xueqing Liu	21fa6c10ec	Fixing the issue that FLAML trial number is significantly smaller than Transformers.hyperparameter_search (#657) * fix 636 * adding low cost config * update padding; update tokenization output y type (series -> DF); update low cost init config * updating todf; updating metric_loss_score	2022-08-03 00:11:29 -04:00
Xueqing Liu	6108493e0b	fix ner bug; refactor post processing of TransformersEstimator prediction (#615) * fix ner bug; refactor post processing * fix too many values to unpack * supporting id/token label for NER	2022-07-05 13:38:21 -04:00
Chi Wang	c45741a67b	support latest xgboost version (#599) * support latest xgboost version * Update test_classification.py * Update Exists problems when installing xgb1.6.1 in py3.6 * cleanup * xgboost version * remove time_budget_s in test * remove redundancy * stop support of python 3.6 Co-authored-by: zsk <shaokunzhang529@gmail.com> Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>	2022-06-21 18:59:07 -07:00
Xueqing Liu	7740cd3466	trying to fix the indexerror for ner (#596) * trying to fix the indexerror for ner	2022-06-16 14:58:23 -04:00
Xueqing Liu	e0e317bfb1	fixing trainable and update function, completing NOTE (#566) * fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode * finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1	2022-06-03 15:19:22 -04:00
Xueqing Liu	2a8decdc50	fix the post-processing bug in NER (#534) * fix conll bug * update DataCollatorForAuto * adding label_list comments	2022-05-10 17:22:57 -04:00
Xueqing Liu	ca35fa969f	refactoring TransformersEstimator to support default and custom_hp (#511) * refactoring TransformersEstimator to support default and custom_hp * handling starting_points not in search space * addressing starting point more than max_iter * fixing upper < lower bug	2022-04-28 14:06:29 -04:00
Xueqing Liu	5f97532986	adding evaluation (#495) * adding automl.score * fixing the metric name in train_with_config * adding pickle after score * fixing a bug in automl.pickle	2022-03-25 17:00:08 -04:00
Xueqing Liu	af423463c3	fixing bug for ner (#463) * fixing bug for ner * removing global var * adding class for trial counter * adding notebook * adding use_ray dict * updating documentation for nlp	2022-03-20 22:03:02 -04:00
Kevin Chen	81f54026c9	Support time series forecasting for discrete target variable (#416) * support 'ts_forecast_classification' task to forecast discrete values * update test_forecast.py - add test for forecasting discrete values * update test_model.py * pre-commit changes	2022-01-24 18:39:36 -08:00
Xueqing Liu	f41f1c2198	Logging multiple checkpoints (#394)	2022-01-12 19:50:39 -08:00
Kevin Chen	d4273669e6	Time series forecasting with sklearn regressors (#362) * add sklearn regressors as learners for ts_forecast task * add direct forecasting strategy warnings and errors for duplicate rows and missing values - add preprocess for sklearn time series forecast update automl.py update test/test_forecast.py * update model.py and test_forecast.py for cv eval_method * add "hcrystalball" dependency in setup.py * update automl.py - add _validate_ts_data function for abstraction - include xgb_limitdepth as a learner * update model.py - update search space for sklearn ts regressors * update automl.py and test_forecast.py for numpy array inputs * add documentations to model.py * add documentation for removing catboost regressor * update automl.py - _validate_ts_data() function Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2022-01-06 23:12:38 -08:00
Xueqing Liu	207b6935d9	adding token classification (#376) * adding ner	2022-01-03 13:44:10 -05:00
Xueqing Liu	ee3162e232	Adding the NLP task summarization (#346) * Add test_autohf_summarization.py * adding seq2seq * Update flaml/nlp/huggingface/trainer.py * rouge metrics Co-authored-by: XinZofStevens <xzhao4346@gmail.com> Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-12-20 14:19:32 -08:00
Chia-Chi Hsu	671ccbbe3f	support for customized splitters (#333) * add support for customized splitters * use the param split_type for feeding generators * use single API for customized splitter and add test * when task==TS_FORCAST, always set shuffle=False * update docstr Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-12-16 16:13:04 -08:00
Xueqing Liu	1a3e01c352	adding HF metrics (#335) * adding nlp metrics * fix ndcg	2021-12-10 12:32:49 -05:00
Chi Wang	18230ed22f	pred_time_limit clarification and logging (#319) * pred_time_limit clarification * log prediction time * handle ChunkedEncodingError in test	2021-12-03 16:02:00 -08:00
Chi Wang	85e21864ce	test -> val; docstr (#300) * rename test -> val in custom metric function * add an example in docstr resolve #299	2021-11-22 22:17:29 -08:00
Chi Wang	ea6d28d7bd	add max_depth to xgboost search space (#282) * add max_depth to xgboost search space * notebook update * two learners for xgboost (max_depth or max_leaves)	2021-11-22 21:17:48 -08:00
Xueqing Liu	42de3075e9	Make NLP tasks available from AutoML.fit() (#210) Sequence classification and regression: "seq-classification" and "seq-regression" Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-11-16 11:06:20 -08:00
Chi Wang	5b0932e442	Unify regression and classification for XGBoost (#276) * scikit-learn API for XGBoostRegressor	2021-11-09 21:23:54 -08:00
Chi Wang	c4d5986ee8	no retraining when max_iter=0 and not retrain_full	2021-11-06 11:37:57 -07:00
Chi Wang	0d9439212f	update docstr	2021-11-06 09:37:33 -07:00
Kevin Chen	519bfc2a18	Integrate multivariate time series forecasting (#254) * Integrate multivariate time series forecasting, now supports continuous and categorical variables - update data.py to transform time series data - update search space - update documentations to reflect changes - update test_forecast.py - rename 'forecast' task to 'ts_forecast' task * update automl.py and test_forecast.py * update forecast notebook * update README.md and setup.py * update ml.py and test_forecast.py - make "ds" and "y" constant variables * replace constants with constant variables * bump version to 0.7.0 * update setup.py - support 'forecast' and 'ts_forecast' * update automl.py and data.py - support 'forecast' and 'ts_forecast' tasks	2021-10-30 09:48:57 -07:00
Chi Wang	7d6e860102	n_estimators for catboost	2021-10-18 21:56:21 -07:00
Chi Wang	f48ca2618f	warning -> info for low cost partial config (#231) * warning -> info for low cost partial config #195, #110 * when n_estimators < 0, use trained_estimator's * log debug info * test random seed * remove "objective"; avoid ZeroDivisionError * hp config to estimator params * check type of searcher * default n_jobs * try import * Update searchalgo_auto.py * CLASSIFICATION * auto_augment flag * min_sample_size * make catboost optional	2021-10-08 16:09:43 -07:00
Chi Wang	a99e939404	update config if n_estimators is modified (#225) * update config if n_estimators is modified * prediction as int * handle the case n_estimators <= 0 * if trained and no budget to train more, return the trained model * split_type=group for classification & regression	2021-09-27 21:30:49 -07:00
Chi Wang	16a97bec76	set converge flag when no trial can be sampled (#217) * set converge flag when no trial can be sampled * require custom_metric to return dict for logging close #218 * estimate time budget needed * log info per iteration	2021-09-23 10:49:02 -07:00
Chi Wang	f4529dfe89	package name in setup (#198) * package name * learning to rank example: close #200 * try import prophet #201	2021-09-11 21:19:18 -07:00
Chi Wang	e46573a01d	warmstart blendsearch (#186) * increase test coverage * use define by run only when needed * warmstart bs * classification -> binary, multi * warm start with evaluated rewards * data transformer; resource attr for gs * BlendSearchTuner bug fix and unittest * bug fix * docstr and import * task type	2021-09-04 01:42:21 -07:00
Qingyun Wu	5fdfa2559b	Cleanml (#185) * reorg ml * return y_pred in eval_estimator * add train loss into metric_for_logging dict	2021-09-02 13:07:30 -07:00
Chi Wang	6ab0730793	remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178) * remove catboost training dir * close #48 * bs for hierarchical space. close #85 * retrain for hierarchical space * clean ml (#180) Co-authored-by: Qingyun Wu <qxw5138@psu.edu> * support ranking task * examples * cv shuffle * forecast api and implementation cleaner * period constraints * delete groups after fit	2021-09-01 16:25:04 -07:00
Qingyun Wu	a229a6112a	Support parallel and add random search (#167) * non hashable value out of signature * parallel trials * add random in _search_parallel * fix bug in retraining * check memory constraint before training * retrain_full * log custom metric * retraining budget check * sample size check before retrain * remove 'time2eval' from result * report 'total_search_time' in result * rename total_search_time to wall_clock_time * rename train_loss boolean to log_training_metric * set default train_loss to None * exclude oom result * log retrained model * no subsample * doc str * notebook * predicted value is NaN for sarimax * version Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qxw5138@psu.edu>	2021-08-23 16:36:51 -07:00
Kevin Chen	3d0a3d26a2	Forecast (#162) * added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax'] * update setup.py * add TimeSeriesSplit to 'regression' and 'classification' task * add 'time' split_type for 'classification' and 'regression' task Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * feature importance * variable name * Update test/test_split.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update test/test_forecast.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * prophet installation fail in windows * upload flaml_forecast.ipynb Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2021-08-23 13:26:46 -07:00
すずまる	6270353458	support ROC and AUC for multi-class classification (#170) * support ROC and AUC for multi-class classification * add a test case to cover ROC and AUC for multi-class classification	2021-08-22 15:16:10 -07:00
Qingyun Wu	10082b9262	v0.5.12 (#150) * remove extra comma * exclusive bound * log file name * add cost to space * dataset_format * add load_openml_dataset test * docstr * revise test format * simplify restore * order categories * openml server exception in test * process space * add warning * log format * reduce n_cpu * nested space * hierarchical search space for CFO * non hierarchical for bs * unflatten hierarchical config * connection error * random sample * config signature * check ray version * preprocess numpy array * catboost preprocess * time budget * seed, verbose, hpo_method * test cfocat * shallow copy in flatten_dict prevent lgbm model duplication * match estimator name * quantize and log * test qloguniform and qrandint * test qlograndint * thread.running Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>	2021-08-11 23:02:22 -07:00

62 Commits