Regression vs Random Forest - Combination of features



I had a discussion with a friend about the advantages of random forest over linear regression.

At some point, my friend said that one of the advantages of random forest over linear regression is that it automatically takes combinations of features into account.

By this he meant that if I have a model with

  • Y as the target
  • X, W, Z as the predictors

then the random forest also tests combinations of the features (e.g. X+W), whereas in linear regression you have to build these combinations manually and insert them into the model.

I am quite confused. Is this true?

Also, if it is true, does it hold for any kind of combination of features (e.g. X*W, X+W+Z, etc.) or only for some specific ones (e.g. X+W)?
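To make the setup concrete, here is a minimal sketch (assuming scikit-learn; the data and names are purely illustrative) of what "building the combination manually" means for linear regression, next to a random forest fit on the raw features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_raw = rng.uniform(-1, 1, size=(500, 3))    # columns play the roles of X, W, Z
y = X_raw[:, 0] * X_raw[:, 1] + X_raw[:, 2]  # target depends on the X*W interaction

# Linear regression: the X*W column must be engineered by hand.
X_manual = np.column_stack([X_raw, X_raw[:, 0] * X_raw[:, 1]])
lin = LinearRegression().fit(X_manual, y)

# Random forest: fit on the raw features only; the question is whether
# it picks up the X*W interaction on its own.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_raw, y)
```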










Tags: feature-selection, random-forest, feature-engineering






asked 2 days ago by Poete Maudit (edited 2 days ago)
          3 Answers



















I think it is true. Tree-based algorithms, especially the ones with multiple trees, have the capability of capturing different feature interactions. Please see this article from the official XGBoost documentation and this discussion. You could say it is a perk of being a non-parametric model (trees are non-parametric; linear regression is not). I hope this sheds some light on the question.
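As a rough illustration of this point (my own sketch, not part of the original answer; it assumes scikit-learn), a forest approximates a purely multiplicative target that a plain linear model on the raw features cannot fit at all:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(2000, 2))
y = X[:, 0] * X[:, 1]  # pure interaction, no additive structure

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

lin = LinearRegression().fit(X_tr, y_tr)
rf = RandomForestRegressor(n_estimators=300, random_state=1).fit(X_tr, y_tr)

# R^2 for the linear model is near 0 (each raw feature alone carries no
# signal about X*W), while the forest recovers the interaction through
# many axis-aligned splits.
print("linear R^2:", lin.score(X_te, y_te))
print("forest R^2:", rf.score(X_te, y_te))
```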






— tam, answered 2 days ago (edited 2 days ago)












• (+1) As an example, Tree 1 works with features (A, B) and gives 80% accuracy, while Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favoring f(A, B) over g(C, D). – Esmailian, 2 days ago

• Thank you for your answer. However, to be honest, I would like a more in-depth answer. To start with, I think my second question is still unanswered: "Also, if it is true, does it hold for any kind of combination of features (e.g. X*W, X+W+Z, etc.) or only for some specific ones (e.g. X+W)?" – Poete Maudit, yesterday

• Please refer to this link (mariofilho.com/can-gradient-boosting-learn-simple-arithmetic). The article discusses how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible: trees, like neural networks, are universal approximators (theoretically), and I am stressing the word "theoretically". – tam, yesterday

• OK, thank you for this too. However, the other two people here are claiming the opposite of what you say, so it is quite difficult for me to draw a definite conclusion. – Poete Maudit, yesterday

• Also, by the way, in your answer you say "... has the capability of capturing different feature interactions". However, my question is whether this is built into random forest (or into boosting algorithms). In a sense, linear regression also has the "capability" of doing this, but you have to program it explicitly, i.e. add some lines of code where you add or multiply some of the features. – Poete Maudit, yesterday



















I would say it is not true. Random forests, which are made up of decision trees, do perform feature selection, but they do not perform feature engineering (feature selection is different from feature engineering). Decision trees use a metric called information gain (the total entropy minus the weighted entropy of the child nodes) to separate useful features from bad ones. Simply put, whichever feature exhibits the highest information gain at a given iteration, i.e. whichever feature reduces the entropy (randomness) the most, is chosen as the node on which the tree is split at that iteration. So if your data is text, trees split on words; if your data is real-valued numbers, trees split on those values. I hope this helps.

For more details check this.
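For concreteness, here is a small sketch of the information-gain computation described above (hand-rolled with NumPy; the toy labels and threshold are made up for illustration):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, left_mask):
    """Total entropy minus the weighted entropy of the two child nodes."""
    n = len(labels)
    left, right = labels[left_mask], labels[~left_mask]
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# A split at x <= 0.5 separates the classes perfectly, so it has the
# maximum possible gain (1 bit) and would be chosen for this node.
y = np.array([0, 0, 0, 1, 1, 1])
x = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
print(information_gain(y, x <= 0.5))  # 1.0
```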






— karthikeyan mg, answered 2 days ago (edited yesterday)












• Thank you for your answer. However, to be honest, I would like a more in-depth answer. To start with, I think my second question is still unanswered: "Also, if it is true, does it hold for any kind of combination of features (e.g. X*W, X+W+Z, etc.) or only for some specific ones (e.g. X+W)?" – Poete Maudit, yesterday

• Yes, as said in my answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain, which is called feature selection. So (X+W), (X*W), or any sort of simple or complex engineered feature is not possible in tree-based models. So the answer to your second question is: "No, tree-based methods cannot and will not perform feature engineering on their own." I hope that is clear. – karthikeyan mg, yesterday

• Now it is significantly clearer, because your opening phrase "I would say it is partly true as random forests..." confused things a bit. So basically your answer to my question is "no, it is not true; random forest does not take combinations of features, e.g. X+W, into account". It would be good to modify your post a bit, because this is not evident. – Poete Maudit, yesterday

• However, I would like to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms, how does the algorithm choose which of the various combinations to test? – Poete Maudit, yesterday

• Thanks for the suggestion, I've made the changes. Regarding your last comment, just to be clear: random forests come under bagging algorithms, while GBDT and XGBoost come under boosting. I'd suggest you draft another question explaining your last comment in detail, along with your thoughts and understanding, and link that question here. We will try our best to help you! Cheers. – karthikeyan mg, yesterday



















No, it is not true.

In a random forest (or a decision tree, or a regression tree), individual features are compared to each other, not combinations of them, and the most informative individual feature is picked to split a node. Therefore, there is no notion of a "better combination" anywhere in the process.

Furthermore, random forest is a bagging algorithm, which does not favor the randomly built trees (or their sub-trees) over each other; they all have the same weight in the aggregated output.

It is worth noting that "rotation forest" first applies PCA to the features, which means each new feature is a linear combination of the original features. However, this does not count, since the same pre-processing can be used with any other method too.

Here is a quote from this wiki page for more details:

A decision tree is a flow-chart-like structure, where each internal (non-leaf) node denotes a test on an attribute.

EDIT:

@tam provided a nice counter-example for XGBoost, which is not the same as random forest. That is, tree boosting algorithms have a notion of "better combination" that, for example, favors (in regression) $f(X, Y, W) = XY + \exp(X+Y) + 0 \times W$ over $g(X, Y, W) = 0 \times X + Y + W$ by placing more weight on $f$. Note that a sub-tree, or a complete tree, represents a function $f(\boldsymbol{X} = (X, Y, Z, \ldots))$ over a specific region $R \subset (\Bbb{X}, \Bbb{Y}, \Bbb{Z}, \ldots)$ of the feature space.

In XGBoost, without going into details, more weight is put on a function that leads to a larger decrease in the overall error, loosely similar to the AdaBoost algorithm (XGBoost in more detail, AdaBoost vs XGBoost). This is equivalent to favoring one complicated combination of features over another.

Note that a tree can approximate any continuous function $f$ over the training points, since it is a universal approximator just like neural networks.
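To make the universal-approximation remark concrete, here is a minimal sketch (my own illustration, assuming scikit-learn): a single deep tree fits a smooth combination of features as a piecewise-constant surface over the training points:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(5000, 2))
y = np.exp(X[:, 0] + X[:, 1])  # a smooth function of a feature combination

# A deep tree carves the plane into many axis-aligned boxes and predicts
# a constant in each box, so its fit over the training points can be made
# arbitrarily good by growing it deeper.
tree = DecisionTreeRegressor(max_depth=12, random_state=0).fit(X, y)
print("training R^2:", tree.score(X, y))  # approaches 1 as depth grows
```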






— Esmailian












• Thank you for your answer. My post triggered some opposing views, and in this sense I do not yet know which side to take. By the way, my impression is that the remark by @tam is not really to the point. The fact that tree boosting algorithms favor f(X, Y) over g(Y, W) does not necessarily mean that they take combinations of features into account in the sense of e.g. X+W; they simply favor groups of features over other groups of features. Thus, not combinations of features but groups of features (if I am not missing anything). – Poete Maudit, yesterday

• @PoeteMaudit I added an example. – Esmailian, yesterday

• Cool, thank you. However, I would like to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms, how does the algorithm choose which of the various combinations to test? – Poete Maudit, yesterday

• So your answer to my question is: "Note that a tree can approximate any continuous function f over the training points, since it is a universal approximator just like neural networks"? If so, then this is interesting. – Poete Maudit, yesterday











          Your Answer





          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f48294%2fregression-vs-random-forest-combination-of-features%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          3 Answers
          3






          active

          oldest

          votes








          3 Answers
          3






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2












          $begingroup$

          I think it is true. Tree based algorithms especially the ones with multiple trees has the capability of capturing different feature interactions. Please see this article from xgboost official documentation and this discussion. You can say it's a perk of being a non parametric model (trees are non parametric and linear regression is not). I hope this will shed some light on this thought.






          share|improve this answer











          $endgroup$












          • $begingroup$
            (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
            $endgroup$
            – Esmailian
            2 days ago











          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
            $endgroup$
            – tam
            yesterday










          • $begingroup$
            Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
            $endgroup$
            – Poete Maudit
            yesterday















          2












          $begingroup$

          I think it is true. Tree based algorithms especially the ones with multiple trees has the capability of capturing different feature interactions. Please see this article from xgboost official documentation and this discussion. You can say it's a perk of being a non parametric model (trees are non parametric and linear regression is not). I hope this will shed some light on this thought.






          share|improve this answer











          $endgroup$












          • $begingroup$
            (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
            $endgroup$
            – Esmailian
            2 days ago











          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
            $endgroup$
            – tam
            yesterday










          • $begingroup$
            Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
            $endgroup$
            – Poete Maudit
            yesterday













          2












          2








          2





          $begingroup$

          I think it is true. Tree based algorithms especially the ones with multiple trees has the capability of capturing different feature interactions. Please see this article from xgboost official documentation and this discussion. You can say it's a perk of being a non parametric model (trees are non parametric and linear regression is not). I hope this will shed some light on this thought.






          share|improve this answer











          $endgroup$



          I think it is true. Tree based algorithms especially the ones with multiple trees has the capability of capturing different feature interactions. Please see this article from xgboost official documentation and this discussion. You can say it's a perk of being a non parametric model (trees are non parametric and linear regression is not). I hope this will shed some light on this thought.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered 2 days ago









          tamtam

          814




          814











          • $begingroup$
            (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
            $endgroup$
            – Esmailian
            2 days ago











          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
            $endgroup$
            – tam
            yesterday










          • $begingroup$
            Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
            $endgroup$
            – Poete Maudit
            yesterday
















          • $begingroup$
            (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
            $endgroup$
            – Esmailian
            2 days ago











          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
            $endgroup$
            – tam
            yesterday










          • $begingroup$
            Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
            $endgroup$
            – Poete Maudit
            yesterday















          $begingroup$
          (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
          $endgroup$
          – Esmailian
          2 days ago





          $begingroup$
          (+1) As an example,Tree 1 works with features (A, B) and gives 80% accuracy, Tree 2 works with features (C, D) and gives 60%. A boosting algorithm puts more weight on Tree 1, thus effectively favors f(A, B) over g(C, D).
          $endgroup$
          – Esmailian
          2 days ago













          $begingroup$
          Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
          $endgroup$
          – Poete Maudit
          yesterday












          $begingroup$
          Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
          $endgroup$
          – tam
          yesterday




          $begingroup$
          Please refer this link ( mariofilho.com/can-gradient-boosting-learn-simple-arithmetic ) . This article talks about how boosting trees can model arithmetic operations like X*W, X/W, etc. Theoretically, it is possible. Trees are like neural networks, they are universal approximator (Theoretically). And I am stressing on the word Theoretically.
          $endgroup$
          – tam
          yesterday












          $begingroup$
          Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          Ok thank you for this too. However, to start with both the other people here are claiming the opposite than you so it is quite difficult for me to draw a definite conclusion.
          $endgroup$
          – Poete Maudit
          yesterday












          $begingroup$
          Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          Also by the way at your answer you are saying "... has the capability of capturing different feature interactions". However, my question is whether is built-in in random forest (or in boosting algos). In a sense, linear regression also has the "capability" of doing this but exactly you will have to programme it i.e. add some lines of code where you are adding, multiplying some of the features etc.
          $endgroup$
          – Poete Maudit
          yesterday











          1












          $begingroup$

          I would say it is not true as Random forests which are made up of decision trees does perform feature selection but they do not perform feature engineering (feature selection is different from feature engineering). Decision trees use a metric called Information gain (which is total entropy minus the weighted entropy) as per which useful features are separated from bad features. Simply to say whichever feature exhibit the highest information gain on this iteration is chosen as the node on which the tree on this iteration is split or you can say which feature reduces the entropy(aka randomness) the most in this iteration is chosen as the node upon which the tree is split on this iteration. So if you data is text, trees are split upon words. If your data is real valued numbers, tree is split upon that. Hope it helps



          For more details check this






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
            $endgroup$
            – karthikeyan mg
            yesterday











          • $begingroup$
            Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
            $endgroup$
            – karthikeyan mg
            yesterday















          1












          $begingroup$

          I would say it is not true as Random forests which are made up of decision trees does perform feature selection but they do not perform feature engineering (feature selection is different from feature engineering). Decision trees use a metric called Information gain (which is total entropy minus the weighted entropy) as per which useful features are separated from bad features. Simply to say whichever feature exhibit the highest information gain on this iteration is chosen as the node on which the tree on this iteration is split or you can say which feature reduces the entropy(aka randomness) the most in this iteration is chosen as the node upon which the tree is split on this iteration. So if you data is text, trees are split upon words. If your data is real valued numbers, tree is split upon that. Hope it helps



          For more details check this






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
            $endgroup$
            – karthikeyan mg
            yesterday











          • $begingroup$
            Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
            $endgroup$
            – karthikeyan mg
            yesterday













          1












          1








          1





          $begingroup$

          I would say it is not true as Random forests which are made up of decision trees does perform feature selection but they do not perform feature engineering (feature selection is different from feature engineering). Decision trees use a metric called Information gain (which is total entropy minus the weighted entropy) as per which useful features are separated from bad features. Simply to say whichever feature exhibit the highest information gain on this iteration is chosen as the node on which the tree on this iteration is split or you can say which feature reduces the entropy(aka randomness) the most in this iteration is chosen as the node upon which the tree is split on this iteration. So if you data is text, trees are split upon words. If your data is real valued numbers, tree is split upon that. Hope it helps



          For more details check this






          share|improve this answer











          $endgroup$



          I would say it is not true as Random forests which are made up of decision trees does perform feature selection but they do not perform feature engineering (feature selection is different from feature engineering). Decision trees use a metric called Information gain (which is total entropy minus the weighted entropy) as per which useful features are separated from bad features. Simply to say whichever feature exhibit the highest information gain on this iteration is chosen as the node on which the tree on this iteration is split or you can say which feature reduces the entropy(aka randomness) the most in this iteration is chosen as the node upon which the tree is split on this iteration. So if you data is text, trees are split upon words. If your data is real valued numbers, tree is split upon that. Hope it helps



          For more details check this







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited yesterday

























          answered 2 days ago









          karthikeyan mgkarthikeyan mg

          305111




          305111











          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
            $endgroup$
            – karthikeyan mg
            yesterday











          • $begingroup$
            Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
            $endgroup$
            – karthikeyan mg
            yesterday
















          • $begingroup$
            Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
            $endgroup$
            – karthikeyan mg
            yesterday











          • $begingroup$
            Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
            $endgroup$
            – karthikeyan mg
            yesterday















          $begingroup$
          Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          Thank you for your answer. However, to be honest I would like a more in depth answer. To start with, my second question is still unanswered I think: "Also if it true then is it about any kind of combination of features (e.g. X*W, X+W+Z etc) or only for some specific ones (e.g. X+W)?"
          $endgroup$
          – Poete Maudit
          yesterday












          $begingroup$
          Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
          $endgroup$
          – karthikeyan mg
          yesterday





          $begingroup$
          Yes as said in my previous answer, decision trees cannot perform feature engineering by themselves. They pick the right feature based on information gain which is called as the feature selection. So (X+W), (X*W) or any sort of simple or complex feature engineered features are not possible in case of tree based models. So answer to your second question is "No, Tree based methods cannot and will not perform feature engineering on their own". Hope it's clear
          $endgroup$
          – karthikeyan mg
          yesterday













          $begingroup$
          Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          Now it is significantly clearer because your starting phrase "I would say it is partly true as Random forests..." confuses things a bit. So basically at my question your answer is "no it is not true; random forest does not take into account the combination of features e.g. X+W etc". It would be good to modify a bit your post because this is not evident.
          $endgroup$
          – Poete Maudit
          yesterday












          $begingroup$
          However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
          $endgroup$
          – Poete Maudit
          yesterday




          $begingroup$
          However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
          $endgroup$
          – Poete Maudit
          yesterday












          $begingroup$
          Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
          $endgroup$
          – karthikeyan mg
          yesterday




          $begingroup$
          Thanks for the suggestion, I've made the changes. And regarding your last comment, just to be clear, random forests comes under bagging algos and gbdt, xgboost comes under boosting. I'd suggest you draft another question explaining your last comment in detail along with your thoughts and understanding and link the question here, We will try our best to help you! Cheers
          $endgroup$
          – karthikeyan mg
          yesterday











          0












          $begingroup$

          No, it is not true.



          In Random Forest (or Decision Tree, or Regression Tree), individual features are compared to each other, not a combination of them, then the most informative individual is peaked to split a node. Therefore, there is no notion of "better combination" in the whole process.



          Furthermore, Random Forest is a bagging algorithm which does not favor the randomly-built trees (or their sub-trees) over each other, they all have the same weight in the aggregated output.



          It is worth noting that "Rotation forest" first applies PCA to features, which means each new feature is a linear combination of original features. However, this does not count since the same pre-processing can be used for any other method too.



          Here is a quote from this wiki page for more details:




          A decision tree is a flow-chart-like structure, where each internal
          (non-leaf) node denotes a test on an attribute.




          EDIT:



          @tam provided a nice counter-example for XGBoost, which is not the same as Random Forest. That is, tree boosting algorithms have a notion of "better combination" that, for example, favors (in regression) $f(X, Y, W)=X*Y+textexp(X+Y) + 0 times W$ over $g(X, Y, W)=0 times X + Y+W$ by placing more weight on $f$. Note that a sub-tree, or a complete tree represents a function $f(boldsymbolX=(X,Y,Z,...))$ over a specific region $R subset (Bbb X, Bbb Y, Bbb Z,...)$ in the feature space.



          In XGBoost, without going into details, more weight is put on a function that leads to more decrease in the overall error, loosely similar to AdaBoost algorithm (XGBoost in more detail, Adaboost vs XGBoost). This is equivalent to favoring a complicated-combination-of-features over another one.



          Note that, a tree can approximate any continuous function $f$ over training points, since it is a universal approximator just like neural networks.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Thank you for your answer. My post triggered some opposing views and now in this sense I do not know yet which side to take. By the way, my impression is that the remark of @tam is not really directly to the point. The fact that tree boosting algorithms favor f(X, Y) over g(Y, W) does not necessarily mean that they take into account the combination of the features in the sense of e.g. X+W but they simply favor groups of features over other groups of features. Thus, not combination of features but groups of features (if I am not missing anything).
            $endgroup$
            – Poete Maudit
            yesterday










          • $begingroup$
            @PoeteMaudit I added an example.
            $endgroup$
            – Esmailian
            yesterday










          • $begingroup$
            Cool, thank you. However, I will have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms how the algorithm chooses which of the various combinations to test?
            $endgroup$
            – Poete Maudit
            yesterday






          • 1




            $begingroup$
            So your answer to my question is that "Note that, a tree can approximate any continuous function f over training points, since it is a universal approximator just like neural networks."? If so then this is interesting.
            $endgroup$
            – Poete Maudit
            yesterday















          0












          $begingroup$

          No, it is not true.



          In Random Forest (or Decision Tree, or Regression Tree), individual features are compared to each other, not a combination of them, then the most informative individual is peaked to split a node. Therefore, there is no notion of "better combination" in the whole process.



          Furthermore, Random Forest is a bagging algorithm which does not favor the randomly-built trees (or their sub-trees) over each other, they all have the same weight in the aggregated output.



          It is worth noting that "Rotation forest" first applies PCA to features, which means each new feature is a linear combination of original features. However, this does not count since the same pre-processing can be used for any other method too.



          Here is a quote from this wiki page for more details:




          A decision tree is a flow-chart-like structure, where each internal
          (non-leaf) node denotes a test on an attribute.




          EDIT:



          @tam provided a nice counter-example for XGBoost, which is not the same as Random Forest. That is, tree boosting algorithms have a notion of "better combination" that, for example, favors (in regression) $f(X, Y, W)=X*Y+textexp(X+Y) + 0 times W$ over $g(X, Y, W)=0 times X + Y+W$ by placing more weight on $f$. Note that a sub-tree, or a complete tree represents a function $f(boldsymbolX=(X,Y,Z,...))$ over a specific region $R subset (Bbb X, Bbb Y, Bbb Z,...)$ in the feature space.



          In XGBoost, without going into details, more weight is put on a function that leads to more decrease in the overall error, loosely similar to AdaBoost algorithm (XGBoost in more detail, Adaboost vs XGBoost). This is equivalent to favoring a complicated-combination-of-features over another one.



          Note that, a tree can approximate any continuous function $f$ over training points, since it is a universal approximator just like neural networks.






– Esmailian (2,487), answered 2 days ago, edited yesterday












• Thank you for your answer. My post triggered some opposing views, so I do not yet know which side to take. By the way, my impression is that the remark of @tam is not really to the point: the fact that tree-boosting algorithms favor f(X, Y) over g(Y, W) does not necessarily mean that they take the combination of the features (in the sense of, e.g., X+W) into account; they simply favor some groups of features over other groups of features. Thus, not combinations of features but groups of features (if I am not missing anything). – Poete Maudit, yesterday

• @PoeteMaudit I added an example. – Esmailian, yesterday

• Cool, thank you. However, I would have to see some evidence on why the boosting algorithms do this while the bagging algorithms do not. Also, in the case of the boosting algorithms, how does the algorithm choose which of the various combinations to test? – Poete Maudit, yesterday

• So your answer to my question is "Note that a tree can approximate any continuous function f over the training points, since it is a universal approximator, just like neural networks"? If so, then this is interesting. – Poete Maudit, yesterday

