autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-07-28 11:20:14 +00:00

Author	SHA1	Message	Date
Kevin Chen	d4273669e6	Time series forecasting with sklearn regressors (#362 ) * add sklearn regressors as learners for ts_forecast task * add direct forecasting strategy warnings and errors for duplicate rows and missing values - add preprocess for sklearn time series forecast update automl.py update test/test_forecast.py * update model.py and test_forecast.py for cv eval_method * add "hcrystalball" dependency in setup.py * update automl.py - add _validate_ts_data function for abstraction - include xgb_limitdepth as a learner * update model.py - update search space for sklearn ts regressors * update automl.py and test_forecast.py for numpy array inputs * add documentations to model.py * add documentation for removing catboost regressor * update automl.py - _validate_ts_data() function Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2022-01-06 23:12:38 -08:00
Chi Wang	612668e8ed	serialize TransformerEstimator (#381 ) * serialize TransformerEstimator * check has_attr * custom metric needs trainer * skip test on mac	2022-01-06 10:28:19 -08:00
Chi Wang	cd9740f022	Fix several issues for nlp tasks (#380 ) * num cpu issue #378; * temp fix for ray issue #379; * transformers version.	2022-01-05 13:49:12 -08:00
Xueqing Liu	207b6935d9	adding token classification (#376 ) * adding ner	2022-01-03 13:44:10 -05:00
Chi Wang	8602def1c4	logging (#371 ) * query logged runs * mlflow log when using ray * key check for newer version of ray #363 * catch importerror * log and load AutoML model * retrain if necessary when ensemble fails	2022-01-02 21:37:19 -08:00
oberonbot	9c00e4272a	Finish the Multiple Choice Classification (#367 ) * adding multiple choice * update test cases (hard coded) * merged common code in predict_proba and predict in TransformersEstimator	2022-01-02 20:12:34 -05:00
Xueqing Liu	b2900f4b22	fixing custom metric (#357 ) * fixing the error for custom metric	2021-12-24 16:23:09 -05:00
Xueqing Liu	dcfd218108	Fixing the bug in custom metric (#356 ) * fixing the bug for custom metric	2021-12-23 18:44:53 -05:00
Xueqing Liu	ee3162e232	Adding the NLP task summarization (#346 ) * Add test_autohf_summarization.py * adding seq2seq * Update flaml/nlp/huggingface/trainer.py * rouge metrics Co-authored-by: XinZofStevens <xzhao4346@gmail.com> Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-12-20 14:19:32 -08:00
Chi Wang	efd85b4c86	Deploy a new doc website (#338 ) A new documentation website. And: * add actions for doc * update docstr * installation instructions for doc dev * unify README and Getting Started * rename notebook * doc about best_model_for_estimator #340 * docstr for keep_search_state #340 * DNN Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu> Co-authored-by: Z.sk <shaokunzhang@psu.edu>	2021-12-16 17:11:33 -08:00
Chi Wang	434586e2e2	train at least one iter when not trained (#336 ) * train at least one iter when not trained * bump version to 0.9.1	2021-12-12 20:05:18 -08:00
Xueqing Liu	1a3e01c352	adding HF metrics (#335 ) * adding nlp metrics * fix ndcg	2021-12-10 12:32:49 -05:00
Chi Wang	54d303a95a	bug fix in config2params (#323 ) * bug fix in config2params * set the task property before config2params	2021-12-03 19:37:49 -08:00
Xueqing Liu	fb59bb9928	adding TODOs for NLP module, so students can implement other tasks easier (#321 ) * fixing ray pickle bug, skipping macosx bug, completing code for seqregression * catching connectionerror * adding TODOs for NLP module	2021-12-03 12:45:16 -05:00
Chi Wang	c57954fbbd	include default value in rf search space (#317 ) * include default value in rf search space * init _mem_per_iter with -1 * bump version to 0.8.2 * docstr for search space's arguments	2021-12-03 09:15:21 -08:00
liususan091219	63f402b29e	fixing config2params for transformersestimator	2021-11-26 21:28:38 -08:00
Xueqing Liu	fd136b02d1	bug fix for TransformerEstimator (#293 ) * fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode * finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1 * adding predict_proba, address PR 293's comments close #293 #291	2021-11-23 11:26:39 -08:00
Chi Wang	ea6d28d7bd	add max_depth to xgboost search space (#282 ) * add max_depth to xgboost search space * notebook update * two learners for xgboost (max_depth or max_leaves)	2021-11-22 21:17:48 -08:00
Chi Wang	72caa2172d	model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283 ) if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO. retrain if using ray. update ITER_HP in config after a trial is finished. change prophet logging level. example and notebook update. allow settings to be passed to AutoML constructor. Are you planning to add multi-output-regression capability to FLAML #192 Is multi-tasking allowed? #277 can pass the automl settings to the constructor instead of requiring a derived class. remove model_history. checkpoint bug fix. * model_history meaning save_best_model_per_estimator * ITER_HP * example update * prophet logging level * comment update in forecast notebook * print format improvement * allow settings to be passed to AutoML constructor * checkpoint bug fix * time limit for autohf regression test * skip slow test on macos * cleanup before del	2021-11-18 09:39:45 -08:00
Xueqing Liu	42de3075e9	Make NLP tasks available from AutoML.fit() (#210 ) Sequence classification and regression: "seq-classification" and "seq-regression" Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-11-16 11:06:20 -08:00
Chi Wang	0d9439212f	update docstr	2021-11-06 09:37:33 -07:00
Chi Wang	549a0dfb53	limit time and memory consumption (#264 ) * limit time and memory * separate tests * lrl1 can't be limited by limit_resource * free memory when possible * passthrough=False when ensemble fails; retrain when trained_estimator is None * use callback for resource limit * handle lower version of xgb with no callback * free mem ratio * reduce verbosity * retrain_final when max_iter==1 * remove trained_estimator from result * model_history * wheel * retrain time as best_config_train_time * ci: libomp version for xgboost on macos * limit_resource not working in windows * test pickle load * mute forecaster * notebook update * check hard * preventive callback * add use_ray	2021-11-03 19:08:23 -07:00
Kevin Chen	519bfc2a18	Integrate multivariate time series forecasting (#254 ) * Integrate multivariate time series forecasting, now supports continuous and categorical variables - update data.py to transform time series data - update search space - update documentations to reflect changes - update test_forecast.py - rename 'forecast' task to 'ts_forecast' task * update automl.py and test_forecast.py * update forecast notebook * update README.md and setup.py * update ml.py and test_forecast.py - make "ds" and "y" constant variables * replace constants with constant variables * bump version to 0.7.0 * update setup.py - support 'forecast' and 'ts_forecast' * update automl.py and data.py - support 'forecast' and 'ts_forecast' tasks	2021-10-30 09:48:57 -07:00
Chi Wang	7d6e860102	n_estimators for catboost	2021-10-18 21:56:21 -07:00
Chi Wang	b2d8b097d7	check n_iter == 1	2021-10-18 21:56:21 -07:00
Chi Wang	b03a87e737	no search when max_iter < 2	2021-10-18 21:56:21 -07:00
Chi Wang	524f22bcc5	fix bug in hierarchical search space (#248 ); optional dependency on lgbm and xgb (#250 ) * close #249 * admissible region * best_config can be None * optional dependency on lgbm and xgb resolve #252	2021-10-15 21:36:42 -07:00
Chi Wang	f48ca2618f	warning -> info for low cost partial config (#231 ) * warning -> info for low cost partial config #195, #110 * when n_estimators < 0, use trained_estimator's * log debug info * test random seed * remove "objective"; avoid ZeroDivisionError * hp config to estimator params * check type of searcher * default n_jobs * try import * Update searchalgo_auto.py * CLASSIFICATION * auto_augment flag * min_sample_size * make catboost optional	2021-10-08 16:09:43 -07:00
Chi Wang	a99e939404	update config if n_estimators is modified (#225 ) * update config if n_estimators is modified * prediction as int * handle the case n_estimators <= 0 * if trained and no budget to train more, return the trained model * split_type=group for classification & regression	2021-09-27 21:30:49 -07:00
Chi Wang	f4529dfe89	package name in setup (#198 ) * package name * learning to rank example: close #200 * try import prophet #201	2021-09-11 21:19:18 -07:00
Chi Wang	e46573a01d	warmstart blendsearch (#186 ) * increase test coverage * use define by run only when needed * warmstart bs * classification -> binary, multi * warm start with evaluated rewards * data transformer; resource attr for gs * BlendSearchTuner bug fix and unittest * bug fix * docstr and import * task type	2021-09-04 01:42:21 -07:00
Chi Wang	6ab0730793	remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178 ) * remove catboost training dir * close #48 * bs for hierarchical space. close #85 * retrain for hierarchical space * clean ml (#180) Co-authored-by: Qingyun Wu <qxw5138@psu.edu> * support ranking task * examples * cv shuffle * forecast api and implementation cleaner * period constraints * delete groups after fit	2021-09-01 16:25:04 -07:00
Qingyun Wu	a229a6112a	Support parallel and add random search (#167 ) * non hashable value out of signature * parallel trials * add random in _search_parallel * fix bug in retraining * check memory constraint before training * retrain_full * log custom metric * retraining budget check * sample size check before retrain * remove 'time2eval' from result * report 'total_search_time' in result * rename total_search_time to wall_clock_time * rename train_loss boolean to log_training_metric * set default train_loss to None * exclude oom result * log retrained model * no subsample * doc str * notebook * predicted value is NaN for sarimax * version Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qxw5138@psu.edu>	2021-08-23 16:36:51 -07:00
Kevin Chen	3d0a3d26a2	Forecast (#162 ) * added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax'] * update setup.py * add TimeSeriesSplit to 'regression' and 'classification' task * add 'time' split_type for 'classification' and 'regression' task Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * feature importance * variable name * Update test/test_split.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update test/test_forecast.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * prophet installation fail in windows * upload flaml_forecast.ipynb Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2021-08-23 13:26:46 -07:00
Qingyun Wu	10082b9262	v0.5.12 (#150 ) * remove extra comma * exclusive bound * log file name * add cost to space * dataset_format * add load_openml_dataset test * docstr * revise test format * simplify restore * order categories * openml server exception in test * process space * add warning * log format * reduce n_cpu * nested space * hierarchical search space for CFO * non hierarchical for bs * unflatten hierarchical config * connection error * random sample * config signature * check ray version * preprocess numpy array * catboost preprocess * time budget * seed, verbose, hpo_method * test cfocat * shallow copy in flatten_dict prevent lgbm model duplication * match estimator name * quantize and log * test qloguniform and qrandint * test qlograndint * thread.running Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>	2021-08-11 23:02:22 -07:00
Chi Wang	15fd8adac4	max_leaves (#138 ) * max_leaf_nodes in rf and extra_tree * preprocess numpy str * free up mem after training	2021-07-27 18:02:49 -07:00
Chi Wang	b3bb00966d	coverage (#135 ) * coverage * readme * timeout	2021-07-20 17:00:44 -07:00
Chi Wang	072e9e4588	constraint (#132 ) * constraint * ensemble	2021-07-10 09:02:17 -07:00
Qingyun Wu	a291abfab9	Cha cha (#127 ) * unordered categorical * allow cost attribute to be None * tensorboardX version * quote * cfo cat * trunc * Update version.py * incumbent is normalized * python 3.9 * remove ConcurrencyLimiter * seed * estimator * update autovw notebook Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qiw@microsoft.com>	2021-07-05 18:17:26 -07:00
Chi Wang	c26720c299	api doc for chacha (#105 ) * api doc for chacha * update params * link to paper * update dataset id Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com> Co-authored-by: Qingyun Wu <qiw@microsoft.com>	2021-06-11 10:25:45 -07:00
Chi Wang	f7cf2ea45a	Multiclass (#99 ) * utility functions * stepsize lower bound	2021-06-04 10:31:33 -07:00
Chi Wang	b206363c9a	metric constraint (#90 ) * penalty change * metric modification * catboost init	2021-05-22 08:51:38 -07:00
Chi Wang	0b23c3a028	stepsize (#86 ) * decrease step size in suggest * initialization of the counters * increase step size * init phase * check converge in suggest	2021-05-06 21:29:38 -07:00
Qingyun Wu	f4f3f4f17b	update image url (#71 ) * update image url * ArffException * OpenMLError is ValueError * CatBoostError * reduce build on push Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>	2021-04-21 01:36:06 -07:00
Qingyun Wu	06045703bf	Lgbm w customized obj (#64 ) * add customized lgbm learner * add comments * fix format issue * format * OpenMLError * add test * add notebook Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-04-10 21:14:28 -04:00
Chi Wang	97a7c114ee	Issue58 (#59 ) * iter per learner * code cleanup	2021-04-08 09:29:55 -07:00
Chi Wang	b7a91e0385	V0.3.0 (#55 ) * flaml v0.3 * low cost partial config	2021-04-06 11:37:52 -07:00
Chi Wang	37d7518a4c	sample weight in xgboost (#54 )	2021-03-31 22:11:56 -07:00
Chi Wang	f28d093522	v0.2.10 (#51 ) * increase search space * None check	2021-03-28 17:54:25 -07:00
Chi Wang	ae5f8e5426	data validation (#45 ) * pickle the AutoML object * get best model per estimator * test deberta * stateless API * prevent divide by zero * test roberta * BlendSearchTuner * delta time * reindex columns when dropping int-indexed columns * test drop columns and small training data * param set for ensemble builder * fillna on copy Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>	2021-03-19 09:50:47 -07:00


53 Commits