{"created":"2023-06-20T13:20:43.767004+00:00","id":763,"links":{},"metadata":{"_buckets":{"deposit":"ec78d315-1b31-433d-b429-3f7a87cbc726"},"_deposit":{"created_by":1,"id":"763","owners":[1],"pid":{"revision_id":0,"type":"depid","value":"763"},"status":"published"},"_oai":{"id":"oai:ir.soken.ac.jp:00000763","sets":["2:429:17"]},"author_link":["9103","9101","9102"],"item_1_creator_2":{"attribute_name":"著者名","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"竹之内, 高志"}],"nameIdentifiers":[{}]}]},"item_1_creator_3":{"attribute_name":"フリガナ","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"タケノウチ, タカシ"}],"nameIdentifiers":[{}]}]},"item_1_date_granted_11":{"attribute_name":"学位授与年月日","attribute_value_mlt":[{"subitem_dategranted":"2004-03-24"}]},"item_1_degree_grantor_5":{"attribute_name":"学位授与機関","attribute_value_mlt":[{"subitem_degreegrantor":[{"subitem_degreegrantor_name":"総合研究大学院大学"}]}]},"item_1_degree_name_6":{"attribute_name":"学位名","attribute_value_mlt":[{"subitem_degreename":"博士(学術)"}]},"item_1_description_12":{"attribute_name":"要旨","attribute_value_mlt":[{"subitem_description":"We deal with statistical learning theory, especially classification problems, by Boosting method. In the context of Boosting method, we can use only a set of weak learners which output statistical discriminant functions having low performance for a given set of examples. Aim of Boosting method is to construct a strong learner by combining a lot of weak learners and a typical boosting algorithm is AdaBoost. AdaBoost can be derived from a sequential minimization of the exponential loss function for a statistical discriminant function. This minimization problem is equivalent to the minimization of the extended Kullback-Leibler divergence between an empirical distribution of given examples and an extended exponential model. Statistical properties of AdaBoost have been investigated and the relationship between the exponential loss function of AdaBoost and the logistic model was revealed. In this thesis, we obtain two main results:

1. AdaBoost is extended to the general U-Boost by using the statistical form of the Bregman divergence, which contains the Kullback-Leibler divergence as a special case, and a geometrical interpretation of U-Boost is given in terms of information geometry.

2. We propose a new Boosting algorithm, η-Boost, which is a robustified version of AdaBoost.

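As background for both results, the following is a minimal sketch of AdaBoost as sequential minimization of the exponential loss L(F) = \sum_i \exp(-y_i F(x_i)). It is an illustrative reconstruction, not code from the thesis: decision stumps are assumed as the weak learners, and the toy data set is made up for the example; the coefficient formula is the standard AdaBoost step.

```python
import numpy as np

def train_stump(X, y, w):
    """Weak learner: the single-feature threshold rule with the smallest
    weighted error under the current weight distribution w."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds=20):
    F = np.zeros(len(y))                 # combined discriminant function on the sample
    for _ in range(rounds):
        w = np.exp(-y * F)               # exponential reweighting by classification results
        w /= w.sum()
        err, j, thr, sign = train_stump(X, y, w)
        err = float(np.clip(err, 1e-12, 1 - 1e-12))
        alpha = 0.5 * np.log((1 - err) / err)   # step size from sequential loss minimization
        F += alpha * sign * np.where(X[:, j] <= thr, 1, -1)
    return F

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0.0, 1, -1)
F = adaboost(X, y)
print("training error:", float(np.mean(np.sign(F) != y)))
```

Note how the weights w_i are proportional to exp(-y_i F(x_i)): this is the exponential reweighting of examples by classification results that the abstract returns to when motivating η-Boost.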
U-Boost is derived from a sequential minimization of the Bregman divergence between the empirical distribution and a U-model. A geometric interpretation of U-Boost is given in terms of information geometry. From the Pythagorean relation associated with the Bregman divergence, we derive two special versions of U-Boost: the normalized U-Boost and the unnormalized U-Boost. We define the normalized version of the U-model on the probability space and derive the normalized U-Boost from this model. The normalized U-Boost corresponds to usual statistical classification methods, for example logistic discriminant analysis. The unnormalized U-Boost is derived from an unnormalized version of the U-model, defined on the extended space of non-negative measures, and has not appeared in the previous statistical context. In particular, the unnormalized U-Boost has a beautiful geometrical structure related to the Pythagorean relation and to flatness. Its algorithm can be interpreted as a pile of right triangles, which leads to a mild convergence property of the U-Boost algorithm, as seen in the EM algorithm. Based on a probabilistic assumption on the training data set, a statistical discussion of the consistency, efficiency, and robustness of U-Boost is given.
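For concreteness, the statistical form of the Bregman divergence behind U-Boost can be written as a U-divergence; the display below is reconstructed from the standard U-Boost literature, not quoted from the thesis. For a convex function U with \xi = (U')^{-1},

D_U(p, q) = \int \Bigl\{ U\bigl(\xi(q(x))\bigr) - U\bigl(\xi(p(x))\bigr) - p(x)\bigl[\xi(q(x)) - \xi(p(x))\bigr] \Bigr\}\, d\mu(x).

Choosing U(t) = e^t, so that \xi(s) = \log s, gives D_U(p, q) = \int \{ p \log(p/q) - p + q \}\, d\mu, the extended Kullback-Leibler divergence on non-negative measures, and the sequential minimization then reduces to AdaBoost.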

The AdaBoost algorithm implements the learning process by exponentially reweighting examples according to their classification results. The resulting weight distribution is often too sharply tuned, so AdaBoost has a weak point with respect to robustness and over-learning. As a special case of U-Boost, we propose η-Boost, which aims to robustify AdaBoost and avoid over-learning. The statistical meaning of η-Boost is discussed: η-Boost is associated with a probabilistic model of mislabeling, a contaminated logistic model. Like the general U-Boost algorithm, η-Boost has a normalized and an unnormalized version. The loss function of the normalized version of η-Boost is the minus log-likelihood of a contaminated logistic model in which the mislabeling probability is constant and does not depend on the input. The unnormalized version of η-Boost is a slight modification of AdaBoost and is derived from a loss function defined as a mixture of the exponential loss of AdaBoost and a naive error loss function. The probabilistic model of the unnormalized version is also a contaminated logistic model, whose mislabeling probability depends on the input. In the algorithm of the unnormalized version of η-Boost, the weight distribution of AdaBoost is moderated by a uniform weight distribution, and the way of combining the weak learners is adjusted by the naive error rate. As a result, η-Boost incorporates an effect of forgetfulness into AdaBoost. For both versions, the tuning parameter η is associated with the degree of contamination of the model, and we can choose it by minimizing the naive error rate.
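To make the mislabeling model concrete, a contaminated logistic model of the kind described above can be written, under our reading of the abstract, as

P(\tilde{y} = u \mid x) = (1 - \varepsilon)\, p(u \mid x) + \varepsilon\, p(-u \mid x), \qquad p(u \mid x) = \frac{e^{u F(x)}}{e^{F(x)} + e^{-F(x)}},

where \tilde{y} is the possibly mislabeled observation and the mislabeling probability \varepsilon is tied to the tuning parameter η. In the normalized version \varepsilon is a constant; in the unnormalized version it depends on the input x.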
We theoretically investigated the robustness of η-Boost and confirmed it with computer experiments. We also applied η-Boost to real datasets and compared it with a previously proposed Boosting method; η-Boost outperformed the other method in terms of robustness.
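As a closing illustration, here is a minimal sketch of the moderated weight update described above, under the assumption that the moderation is a plain convex combination of AdaBoost's exponential weight distribution and the uniform distribution; the thesis's exact update may differ.

```python
import numpy as np

def eta_boost_weights(F, y, eta):
    """eta-Boost-style reweighting (assumed form): AdaBoost's exponential
    weights, moderated by a uniform 'forgetfulness' component."""
    w_exp = np.exp(-y * F)
    w_exp /= w_exp.sum()                   # AdaBoost's weight distribution
    w_uni = np.full(len(y), 1.0 / len(y))  # uniform weight distribution
    w = (1 - eta) * w_exp + eta * w_uni
    return w                               # already normalized: convex combination
```

With eta = 0 this reduces to AdaBoost's reweighting; as eta grows, examples with very large exponential weight (typically outliers or mislabeled points) lose influence, which is the forgetfulness effect mentioned above. Per the abstract, eta itself can be chosen by minimizing the naive error rate.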