autogen

mirror of https://github.com/microsoft/autogen.git synced 2025-07-26 02:11:24 +00:00

Author	SHA1	Message	Date
Xueqing Liu	2a8decdc50	fix the post-processing bug in NER (#534) * fix conll bug * update DataCollatorForAuto * adding label_list comments	2022-05-10 17:22:57 -04:00
Xueqing Liu	ca35fa969f	refactoring TransformersEstimator to support default and custom_hp (#511) * refactoring TransformersEstimator to support default and custom_hp * handling starting_points not in search space * addressing starting point more than max_iter * fixing upper < lower bug	2022-04-28 14:06:29 -04:00
Xueqing Liu	5f97532986	adding evaluation (#495) * adding automl.score * fixing the metric name in train_with_config * adding pickle after score * fixing a bug in automl.pickle	2022-03-25 17:00:08 -04:00
Xueqing Liu	af423463c3	fixing bug for ner (#463) * fixing bug for ner * removing global var * adding class for trial counter * adding notebook * adding use_ray dict * updating documentation for nlp	2022-03-20 22:03:02 -04:00
Kevin Chen	81f54026c9	Support time series forecasting for discrete target variable (#416) * support 'ts_forecast_classification' task to forecast discrete values * update test_forecast.py - add test for forecasting discrete values * update test_model.py * pre-commit changes	2022-01-24 18:39:36 -08:00
Xueqing Liu	f41f1c2198	Logging multiple checkpoints (#394)	2022-01-12 19:50:39 -08:00
Kevin Chen	d4273669e6	Time series forecasting with sklearn regressors (#362) * add sklearn regressors as learners for ts_forecast task * add direct forecasting strategy warnings and errors for duplicate rows and missing values - add preprocess for sklearn time series forecast update automl.py update test/test_forecast.py * update model.py and test_forecast.py for cv eval_method * add "hcrystalball" dependency in setup.py * update automl.py - add _validate_ts_data function for abstraction - include xgb_limitdepth as a learner * update model.py - update search space for sklearn ts regressors * update automl.py and test_forecast.py for numpy array inputs * add documentations to model.py * add documentation for removing catboost regressor * update automl.py - _validate_ts_data() function Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2022-01-06 23:12:38 -08:00
Xueqing Liu	207b6935d9	adding token classification (#376) * adding ner	2022-01-03 13:44:10 -05:00
Xueqing Liu	ee3162e232	Adding the NLP task summarization (#346) * Add test_autohf_summarization.py * adding seq2seq * Update flaml/nlp/huggingface/trainer.py * rouge metrics Co-authored-by: XinZofStevens <xzhao4346@gmail.com> Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com> Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-12-20 14:19:32 -08:00
Chia-Chi Hsu	671ccbbe3f	support for customized splitters (#333) * add support for customized splitters * use the param split_type for feeding generators * use single API for customized splitter and add test * when task==TS_FORCAST, always set shuffle=False * update docstr Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-12-16 16:13:04 -08:00
Xueqing Liu	1a3e01c352	adding HF metrics (#335) * adding nlp metrics * fix ndcg	2021-12-10 12:32:49 -05:00
Chi Wang	18230ed22f	pred_time_limit clarification and logging (#319) * pred_time_limit clarification * log prediction time * handle ChunkedEncodingError in test	2021-12-03 16:02:00 -08:00
Chi Wang	85e21864ce	test -> val; docstr (#300) * rename test -> val in custom metric function * add an example in docstr resolve #299	2021-11-22 22:17:29 -08:00
Chi Wang	ea6d28d7bd	add max_depth to xgboost search space (#282) * add max_depth to xgboost search space * notebook update * two learners for xgboost (max_depth or max_leaves)	2021-11-22 21:17:48 -08:00
Xueqing Liu	42de3075e9	Make NLP tasks available from AutoML.fit() (#210) Sequence classification and regression: "seq-classification" and "seq-regression" Co-authored-by: Chi Wang <wang.chi@microsoft.com>	2021-11-16 11:06:20 -08:00
Chi Wang	5b0932e442	Unify regression and classification for XGBoost (#276) * scikit-learn API for XGBoostRegressor	2021-11-09 21:23:54 -08:00
Chi Wang	c4d5986ee8	no retraining when max_iter=0 and not retrain_full	2021-11-06 11:37:57 -07:00
Chi Wang	0d9439212f	update docstr	2021-11-06 09:37:33 -07:00
Kevin Chen	519bfc2a18	Integrate multivariate time series forecasting (#254) * Integrate multivariate time series forecasting, now supports continuous and categorical variables - update data.py to transform time series data - update search space - update documentations to reflect changes - update test_forecast.py - rename 'forecast' task to 'ts_forecast' task * update automl.py and test_forecast.py * update forecast notebook * update README.md and setup.py * update ml.py and test_forecast.py - make "ds" and "y" constant variables * replace constants with constant variables * bump version to 0.7.0 * update setup.py - support 'forecast' and 'ts_forecast' * update automl.py and data.py - support 'forecast' and 'ts_forecast' tasks	2021-10-30 09:48:57 -07:00
Chi Wang	7d6e860102	n_estimators for catboost	2021-10-18 21:56:21 -07:00
Chi Wang	f48ca2618f	warning -> info for low cost partial config (#231) * warning -> info for low cost partial config #195, #110 * when n_estimators < 0, use trained_estimator's * log debug info * test random seed * remove "objective"; avoid ZeroDivisionError * hp config to estimator params * check type of searcher * default n_jobs * try import * Update searchalgo_auto.py * CLASSIFICATION * auto_augment flag * min_sample_size * make catboost optional	2021-10-08 16:09:43 -07:00
Chi Wang	a99e939404	update config if n_estimators is modified (#225) * update config if n_estimators is modified * prediction as int * handle the case n_estimators <= 0 * if trained and no budget to train more, return the trained model * split_type=group for classification & regression	2021-09-27 21:30:49 -07:00
Chi Wang	16a97bec76	set converge flag when no trial can be sampled (#217) * set converge flag when no trial can be sampled * require custom_metric to return dict for logging close #218 * estimate time budget needed * log info per iteration	2021-09-23 10:49:02 -07:00
Chi Wang	f4529dfe89	package name in setup (#198) * package name * learning to rank example: close #200 * try import prophet #201	2021-09-11 21:19:18 -07:00
Chi Wang	e46573a01d	warmstart blendsearch (#186) * increase test coverage * use define by run only when needed * warmstart bs * classification -> binary, multi * warm start with evaluated rewards * data transformer; resource attr for gs * BlendSearchTuner bug fix and unittest * bug fix * docstr and import * task type	2021-09-04 01:42:21 -07:00
Qingyun Wu	5fdfa2559b	Cleanml (#185) * reorg ml * return y_pred in eval_estimator * add train loss into metric_for_logging dict	2021-09-02 13:07:30 -07:00
Chi Wang	6ab0730793	remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178) * remove catboost training dir * close #48 * bs for hierarchical space. close #85 * retrain for hierarchical space * clean ml (#180) Co-authored-by: Qingyun Wu <qxw5138@psu.edu> * support ranking task * examples * cv shuffle * forecast api and implementation cleaner * period constraints * delete groups after fit	2021-09-01 16:25:04 -07:00
Qingyun Wu	a229a6112a	Support parallel and add random search (#167) * non hashable value out of signature * parallel trials * add random in _search_parallel * fix bug in retraining * check memory constraint before training * retrain_full * log custom metric * retraining budget check * sample size check before retrain * remove 'time2eval' from result * report 'total_search_time' in result * rename total_search_time to wall_clock_time * rename train_loss boolean to log_training_metric * set default train_loss to None * exclude oom result * log retrained model * no subsample * doc str * notebook * predicted value is NaN for sarimax * version Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qxw5138@psu.edu>	2021-08-23 16:36:51 -07:00
Kevin Chen	3d0a3d26a2	Forecast (#162) * added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax'] * update setup.py * add TimeSeriesSplit to 'regression' and 'classification' task * add 'time' split_type for 'classification' and 'regression' task Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com> * feature importance * variable name * Update test/test_split.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * Update test/test_forecast.py Co-authored-by: Chi Wang <wang.chi@microsoft.com> * prophet installation fail in windows * upload flaml_forecast.ipynb Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>	2021-08-23 13:26:46 -07:00
すずまる	6270353458	support ROC and AUC for multi-class classification (#170) * support ROC and AUC for multi-class classification * add a test case to cover ROC and AUC for multi-class classification	2021-08-22 15:16:10 -07:00
Qingyun Wu	10082b9262	v0.5.12 (#150) * remove extra comma * exclusive bound * log file name * add cost to space * dataset_format * add load_openml_dataset test * docstr * revise test format * simplify restore * order categories * openml server exception in test * process space * add warning * log format * reduce n_cpu * nested space * hierarchical search space for CFO * non hierarchical for bs * unflatten hierarchical config * connection error * random sample * config signature * check ray version * preprocess numpy array * catboost preprocess * time budget * seed, verbose, hpo_method * test cfocat * shallow copy in flatten_dict prevent lgbm model duplication * match estimator name * quantize and log * test qloguniform and qrandint * test qlograndint * thread.running Co-authored-by: Chi Wang <wang.chi@microsoft.com> Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>	2021-08-11 23:02:22 -07:00
Chi Wang	072e9e4588	constraint (#132) * constraint * ensemble	2021-07-10 09:02:17 -07:00
Chi Wang	e039861ab0	multiple logged metrics in cv (#114)	2021-06-18 21:19:59 -07:00
Chi Wang	183b867856	groups (#107) * groups * version * developer's guide	2021-06-15 18:52:57 -07:00
Chi Wang	f7cf2ea45a	Multiclass (#99) * utility functions * stepsize lower bound	2021-06-04 10:31:33 -07:00
Gian Pio Domiziani	c4c15f533f	datetime feature engineering added. (#89) * datetime feature engineering added. * check if datetime in columns moved after drop check. Check if the new columns do not already exist. * check the drop condition before to add new_column. In transform, check directly if new columns are present in num_column. * check if new_column is in X.columns. * fixed lint issue. update version to 0.4.1.	2021-05-25 08:30:08 -07:00
Chi Wang	0b23c3a028	stepsize (#86) * decrease step size in suggest * initialization of the counters * increase step size * init phase * check converge in suggest	2021-05-06 21:29:38 -07:00
Gian Pio Domiziani	730fd14ef6	micro/macro f1 metrics added. (#80) * micro/macro f1 metrics added. * format lines.	2021-04-26 14:50:41 -04:00
Chi Wang	97a7c114ee	Issue58 (#59) * iter per learner * code cleanup	2021-04-08 09:29:55 -07:00
Chi Wang (MSR)	14d59effbe	bug fix	2021-02-05 22:45:02 -08:00
Chi Wang	776aa55189	V0.2.2 (#19) * v0.2.2 separate the HPO part into the module flaml.tune enhanced implementation of FLOW^2, CFO and BlendSearch support parallel tuning using ray tune add support for sample_weight and generic fit arguments enable mlflow logging Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com> Co-authored-by: qingyun-wu <qw2ky@virginia.edu>	2021-02-05 21:41:14 -08:00
Eric Zhu	4ce908f42e	Fix #11; add tests for training log and python logger (#12)	2020-12-14 23:10:03 -08:00
Chi Wang (MSR)	492990655d	v0.1.0	2020-12-04 09:40:27 -08:00

43 Commits