Categorical vs continuous feature selection/engineering


I'm working with a dataset that has a number of potential predictors, such as:

Age: continuous

Number of children: discrete and numerical

Marital Situation: categorical (Married/Single/Divorced, ...)

Id_User: categorical (the ID of the user who conducted the first interview with this person)

There are more predictors, but these four are enough to ask my question.

Question: Continuous features are easy to deal with: normalize them and feed them to the model. What about categorical features whose categories are independent of one another?

Note: I understand that categorical features that follow a natural order can be encoded as integers and fed to the model. But what if the integers carry no meaning (1 for single, 2 for married, 3 for divorced)? A model that treats the feature as a quantitative predictor would read an order into those values that isn't there.

Are there ways to deal with these different types of features?

edited Apr 12 at 10:57

asked Apr 12 at 10:17

Blenzus

machine-learning feature-selection feature-engineering


5 Answers
What you are looking for are called dummy variables. They convert your categorical data into a matrix in which each column is 1 if the person belongs to that category and 0 otherwise.

The ID variable should not be converted, because you don't want your model to overfit to the ID data (that is, you don't want your model to memorize the result for every ID; you want it to generalize).

```python
import pandas as pd

# `dataset` is your DataFrame; string/categorical columns become indicator columns
dataset2 = pd.get_dummies(dataset)
```
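As a rough sketch of how this might look on data shaped like the question's (the rows below are invented purely for illustration, and the column spellings are assumptions), one could drop the ID column first and encode only the marital column:

```python
import pandas as pd

# Hypothetical toy data mirroring the columns described in the question.
dataset = pd.DataFrame({
    "Age": [34, 51, 27, 45],
    "Number_of_children": [2, 3, 0, 1],
    "Marital_Situation": ["Married", "Divorced", "Single", "Married"],
    "Id_User": ["u17", "u03", "u17", "u42"],
})

# Drop the interviewer ID, then one-hot encode only the categorical column;
# the numeric columns pass through unchanged.
features = dataset.drop(columns=["Id_User"])
dataset2 = pd.get_dummies(features, columns=["Marital_Situation"], dtype=int)

print(dataset2.columns.tolist())
```

Passing `columns=` explicitly keeps the encoding from silently picking up other string columns, which matters once the real dataset has more predictors than these four.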

edited Apr 13 at 13:12

Stephen Rauch♦


answered Apr 12 at 12:19

Juan Esteban de la Calle


There are two common ways to encode categorical features:

Ordinal encoder

This is the approach you described as 'encoded as integers'. Each category is assigned an integer, starting from 0. The problem with this method is that it imposes an arbitrary order on the categories. When the categories have no natural order, the encoding is meaningless, as you mentioned. It only works when assigning a larger integer to some categories is meaningful.

One-hot encoder

This method creates a feature vector (a one-hot vector) for each categorical feature, with length equal to the number of categories. Each component of the vector corresponds to one category. For each data sample, the component whose category is present in the sample is set to 1 and all other components are set to 0. The benefit of this method is that, unlike the ordinal encoder, it does not rank any category above another.

So in your case, I highly recommend using a one-hot encoder.
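To make the contrast concrete, here is a minimal sketch using pandas rather than a dedicated encoder class. The `satisfaction` feature is a hypothetical ordered variable that is not in the question; it is included only to show a case where integer codes are meaningful.

```python
import pandas as pd

# Hypothetical ordered feature: integers make sense because low < medium < high.
satisfaction = pd.Series(["low", "high", "medium", "low"])
ordered = pd.Categorical(
    satisfaction, categories=["low", "medium", "high"], ordered=True
)
ordinal_codes = ordered.codes.tolist()  # [0, 2, 1, 0]

# Nominal feature from the question: no natural order, so give each
# category its own indicator column instead of a single integer.
marital = pd.Series(["Single", "Married", "Divorced", "Married"], name="Marital")
one_hot = pd.get_dummies(marital, dtype=int)

print(ordinal_codes)
print(one_hot)
```

Every row of `one_hot` contains exactly one 1, so no category ends up numerically "larger" than another.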

answered Apr 12 at 11:29

pythinker

The number of columns one-hot encoding adds to the dataset doesn't affect the outcome of my model in any way, right?
– Blenzus
Apr 12 at 13:46

Are you afraid of overfitting?
– pythinker
Apr 12 at 13:51

Aren't we all?! I'm under the impression that, on the contrary, this method doesn't 'encourage' overfitting.
– Blenzus
Apr 12 at 13:53

Yes, you are right. We are all afraid of overfitting. By overfitting I meant that when we increase the number of inputs, the model has to learn more weights to map these inputs to outputs. So I should say that it does affect the outcome of your model somewhat, but it's not a serious concern.
– pythinker
Apr 12 at 13:59

I believe that in the context of machine learning, "dummy variable" is more commonly used for what you are referring to as "one-hot".
– Acccumulation
Apr 12 at 15:21

One way to deal with categorical inputs is to introduce a category input vector $\boldsymbol{t}$. The category input vector of the $n^{\text{th}}$ observation is given by

$\boldsymbol{t}_n=[t_{1n}, t_{2n},\ldots,t_{Kn}],$ in which $K$ is the number of categories. If the continuous input vector $\boldsymbol{x}_n$ belongs to category $k$, then $t_{in}=1$ for $i=k$ and $t_{in}=0$ for $i\neq k$.

This type of encoding is called one-hot encoding for classification.
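As a small worked instance of this definition, assume a hypothetical numbering of $K=3$ marital categories (1 = Married, 2 = Single, 3 = Divorced; the numbering is arbitrary and chosen only for illustration). If the $n^{\text{th}}$ person is Single, then $k=2$ and

$$\boldsymbol{t}_n=[t_{1n}, t_{2n}, t_{3n}]=[0, 1, 0].$$

Exactly one component is 1, so the components of $\boldsymbol{t}_n$ always sum to 1 regardless of how the categories are numbered.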

answered Apr 12 at 11:02

MachineLearner

I have a lot of possible values (100+) in, let's say, Id_User. Wouldn't that add 100 additional columns to my dataset?
– Blenzus
Apr 12 at 11:08

@Blenzus: Yes, you are right, but the columns are sparse. Keep in mind that having so many categories is only feasible if you have enough data for your dataset to be representative.
– MachineLearner
Apr 12 at 12:49
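To illustrate the sparsity point from this comment, here is a hedged sketch with made-up data: 1,000 hypothetical interviews spread evenly over 100 invented interviewer IDs. It relies on the `sparse=True` option of `pd.get_dummies`, which stores only the non-zero entries.

```python
import numpy as np
import pandas as pd

# Hypothetical: 1,000 interviews spread evenly over 100 interviewer IDs.
ids = pd.Series([f"user_{i % 100:03d}" for i in range(1000)], name="Id_User")

# Same encoding twice: once as ordinary columns, once as sparse columns.
dense_df = pd.get_dummies(ids, dtype=np.uint8)
sparse_df = pd.get_dummies(ids, dtype=np.uint8, sparse=True)

print(dense_df.shape)            # (1000, 100)
print(sparse_df.sparse.density)  # fraction of entries actually stored
```

With 100 categories only about one entry in a hundred is non-zero, which is why the extra columns cost less memory than the column count suggests. Whether the model can learn anything useful from them still depends on having enough rows per ID.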


As others have said, dummy variables are one method. Another is to use summary statistics of the population that shares the property. For instance, you can create a "marital situation average" column and populate it with the average value of the target variable among people with the same marital situation as that subject.
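A minimal sketch of that "marital situation average" column, assuming a pandas DataFrame with a hypothetical binary `target` column (the data and column names are invented for illustration):

```python
import pandas as pd

# Hypothetical training data; "target" stands in for whatever is being predicted.
train = pd.DataFrame({
    "Marital_Situation": ["Married", "Single", "Married", "Divorced", "Single", "Married"],
    "target": [1, 0, 1, 0, 1, 0],
})

# Mean of the target within each category, computed on training rows only.
category_means = train.groupby("Marital_Situation")["target"].mean()

# New numeric column: each row gets the average target of its own category.
train["marital_situation_average"] = train["Marital_Situation"].map(category_means)

print(train)
```

The averages should come from the training split only and then be mapped onto validation or test rows; computing them on all rows would leak the target into the feature.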

If you are using a tree method, simply assigning integers to each category will approximate dummy variables, especially if there are only a few categories. For instance, if the only categories for marital situation are Married, Single, Divorced, and Widowed, and you assign them 0, 1, 2, 3 respectively, then the only possible splits are Married vs. Everything else, Widowed vs. Everything else, or Married/Single vs Divorced/Widowed. So two thirds of the splits are effectively dummy variables, and the last one will turn into a dummy variable as soon as you split on that variable again.
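The counting argument above can be checked with a short sketch. It assumes the tree splits an integer feature with a simple threshold test of the form `code <= t`, which is how most tree implementations treat numeric inputs.

```python
categories = ["Married", "Single", "Divorced", "Widowed"]  # encoded as 0, 1, 2, 3

# With four integer codes, only the thresholds that fall between
# consecutive codes produce distinct left/right partitions.
splits = []
for threshold in range(len(categories) - 1):
    left = categories[: threshold + 1]
    right = categories[threshold + 1 :]
    splits.append((left, right))

# A split behaves like a dummy variable when one side holds a single category.
dummy_like = [s for s in splits if len(s[0]) == 1 or len(s[1]) == 1]

for left, right in splits:
    print(left, "vs", right)
print(len(dummy_like), "of", len(splits), "splits are dummy-like")
```

Note that which categories can be isolated in one split depends on the integer assignment: here only Married and Widowed sit at the ends of the range.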

answered Apr 12 at 15:36

Acccumulation


There are a number of ways to handle categorical data, but the approach I have seen most often is to map the categories to integers and then one-hot encode those integers before feeding them into a neural network.

If you are working with Keras, you can use the to_categorical function to transform your mappings accordingly.

```python
>>> from keras.utils import to_categorical
>>> y = [0, 1, 0, 1, 1]  # integer class labels
>>> oh_y = to_categorical(y, num_classes=2)  # one row per label
>>> print(oh_y)
[[1. 0.]
 [0. 1.]
 [1. 0.]
 [0. 1.]
 [0. 1.]]
```
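For readers not using Keras, the same two steps (integer mapping, then one-hot) can be sketched with NumPy alone. This is only an illustrative equivalent with invented sample values, not what `to_categorical` does internally.

```python
import numpy as np

# Hypothetical raw categories. np.unique returns the sorted labels plus an
# integer index for every sample, which serves as the integer mapping.
marital = np.array(["Single", "Married", "Divorced", "Married"])
labels, codes = np.unique(marital, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(len(labels), dtype=int)[codes]

print(labels)
print(codes)
print(one_hot)
```

Keeping `labels` around makes it possible to map a column of `one_hot` back to its category name later.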

answered Apr 13 at 9:27

thanatoz

Thanks for the answer. No, I'm actually using "regular" classification algorithms, but yes, I've used the to_categorical method while testing an ANN.
– Blenzus
yesterday
