


Is a distribution that is normal, but highly skewed, considered Gaussian?













8












$begingroup$


I have this question: What do you think the distribution of time spent per day on YouTube looks like?



My answer is that it is probably normally distributed and highly left skewed. I expect there is one mode where most users spend around some average time and then a long right tail since some users are overwhelming power users.



Is that a fair answer? Is there a better word for that distribution?





















$endgroup$







  • 3




    $begingroup$
    As some answers mention but do not emphasise, skewness is named informally for the longer tail if there is one, so a distribution with a longer right tail is right-skewed. Left and right as used in this context both presuppose a display following a convention that magnitude is shown on the horizontal axis. If that sounds too obvious, consider displays in the Earth and environmental sciences in which the magnitude is height or depth and shown vertically. Small print: some measures of skewness can be zero even if a distribution is skewed geometrically.
    $endgroup$
    – Nick Cox
    yesterday











  • $begingroup$
    Total time per day for all users, or time per day per person? If the latter, then surely there's a moderately big spike at 0, in which case you probably need a 'spike and slab' style distribution with a Dirac delta at 0.
    $endgroup$
    – innisfree
    22 hours ago











  • $begingroup$
    "Normal" is synonymous with "Gaussian", and Gaussian distributions, also called normal distributions, are not skewed.
    $endgroup$
    – Michael Hardy
    3 hours ago


















distributions normal-distribution skewness skew-normal














edited yesterday by Nick Cox

asked 2 days ago by Cauder

















9 Answers


















11












$begingroup$

A fraction per day is certainly not negative. This rules out the normal distribution, which has probability mass over the entire real axis - in particular over the negative half.



Power law distributions are often used to model things like income distributions, sizes of cities etc. They are nonnegative and typically highly skewed. These would be the first I would try in modeling time spent watching YouTube. (Or monitoring CrossValidated questions.)



More information on power laws can be found here or here, or in our power-law tag.















$endgroup$








  • 12




    $begingroup$
    You're completely correct that normal distributions have support on the whole real line. And yet... they're not an awful model for some strictly positive quantities, like adults' height or weight, where the mean and variance are such that negative values are very unlikely under the model.
    $endgroup$
    – Matt Krause
    2 days ago






  • 1




    $begingroup$
    @MattKrause That's actually a great question: is there the same probability that I will be '10 cm above or below the mean height' or '10 percent above or below the mean height'? Only the first case could warrant a normal distribution.
    $endgroup$
    – Tomáš Kafka
    17 hours ago










  • $begingroup$
    @MattKrause: I completely agree, in a general sense. Yet, the present question is about the proportion of daily time spent watching YouTube. We don't have any data, but I would be extremely surprised if the distribution was even remotely symmetric.
    $endgroup$
    – Stephan Kolassa
    14 hours ago


















35












$begingroup$

A distribution that is normal is not highly skewed. That is a contradiction. Normally distributed variables have skew = 0.
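To make the zero-skew point concrete, here is a minimal sketch (not part of the original answer) that compares the moment-based sample skewness of a simulated normal sample with that of a simulated log-normal sample. The name `sample_skewness`, the seed, and the sample sizes are illustrative assumptions; only Python's standard library is used.

```python
import random


def sample_skewness(xs):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# Simulated data only; these are not real watch-time measurements.
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
lognormal_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(100_000)]

print("normal sample skewness:    ", round(sample_skewness(normal_sample), 3))
print("log-normal sample skewness:", round(sample_skewness(lognormal_sample), 3))
```

The normal sample's skewness should land close to 0, while the log-normal sample's is clearly positive. With a much smaller sample, a draw from a normal population can still show noticeable sample skewness, so the statement "skew = 0" is about the population, not about any particular sample.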















$endgroup$








  • 1




    $begingroup$
    What is a better way to describe the distribution? Is there a word for that type of distribution where it centers around a mode and then has a long tail?
    $endgroup$
    – Cauder
    2 days ago






  • 10




    $begingroup$
    Unimodal and skewed is as close as I can come...
    $endgroup$
    – jbowman
    2 days ago






  • 8




    $begingroup$
    As an aside, it's just really incredible that people give their time to help other people get better at this stuff. I know it goes without saying, but it's so cool what you both do!
    $endgroup$
    – Cauder
    2 days ago






  • 5




    $begingroup$
    Yes, but it's worth clarifying that that statement pertains to the normally distributed population. A sample drawn from that population can be very skewed.
    $endgroup$
    – gung
    2 days ago










  • $begingroup$
    When the skew value is small ("small" being decided by the people dealing with the stats in question), you can still treat the population as normal, albeit with minor error as a result.
    $endgroup$
    – Carl Witthoft
    11 hours ago


















15












$begingroup$

If it has a long right tail, then it is right skewed.






It can't be a normal distribution, since its skew is not 0. It is perhaps a unimodal skew normal distribution:



https://en.wikipedia.org/wiki/Skew_normal_distribution












$endgroup$




















    9












    $begingroup$

    It could be a log-normal distribution. As mentioned here:




    Users' dwell time on online articles (jokes, news etc.) follows a log-normal distribution.




    The reference given is: Yin, Peifeng; Luo, Ping; Lee, Wang-Chien; Wang, Min (2013). Silence is also evidence: interpreting dwell time for recommendation from psychological perspective. ACM International Conference on KDD.















    $endgroup$




















      5












      $begingroup$

      The gamma distribution could be a good candidate to describe this kind of distribution over nonnegative, right-skewed data. See the green line in the image here:
      https://en.m.wikipedia.org/wiki/Gamma_distribution















      $endgroup$




















        3












        $begingroup$

        "Normal" and "Gaussian" mean exactly the same thing. As other answers explain, the distribution you're talking about is not normal/Gaussian, because that distribution assigns probabilities to every value on the real line, whereas your distribution only exists between $0$ and $24$.















        $endgroup$




















          2












          $begingroup$

          In the case at hand, since the time spent per day is bounded between $0$ and $1$ (if quantified as a fraction of the day), distributions that are unbounded above (e.g. Pareto, skew-normal, gamma, log-normal) won't work, but a beta distribution would.
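As an illustration of the beta suggestion, the following sketch draws simulated "fraction of the day" values from a beta distribution and converts them to hours. The shape parameters `alpha = 1.5` and `beta = 20.0` are hypothetical values chosen only to give a right-skewed shape; they are not fitted to any data.

```python
import random

rng = random.Random(1)  # fixed seed for reproducibility

# Hypothetical shape parameters (assumption, not estimated from data).
alpha, beta = 1.5, 20.0

# Each draw is a fraction of the day, so it always lies in [0, 1].
fractions = [rng.betavariate(alpha, beta) for _ in range(50_000)]
hours = [24.0 * f for f in fractions]

mean_fraction = sum(fractions) / len(fractions)
theoretical_mean = alpha / (alpha + beta)
# Mode formula is valid because both shape parameters exceed 1.
theoretical_mode = (alpha - 1.0) / (alpha + beta - 2.0)

print("simulated mean (hours):  ", round(24.0 * mean_fraction, 2))
print("theoretical mean (hours):", round(24.0 * theoretical_mean, 2))
print("theoretical mode (hours):", round(24.0 * theoretical_mode, 2))
```

With these parameters the mode sits below the mean, which is the unimodal, right-skewed shape described in the question, and no draw can fall outside the 0 to 24 hour range.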















          $endgroup$




















            2












            $begingroup$

            How about a hurdle model?



            A hurdle model has two parts. The first is a Bernoulli experiment that determines whether you use YouTube at all. If you don't, then your usage time is obviously zero and you're done. If you do, you "pass that hurdle", and the usage time then comes from some other strictly positive distribution.



            A closely related concept is the zero-inflated model. Such models are meant to deal with a situation where we observe a bunch of zeros, but can't distinguish between always-zeros and sometimes-zeros. For example, consider the number of cigarettes that a person smokes each day. For non-smokers, that number is always zero, but some smokers may not smoke on a given day (out of cigarettes? on a long flight?). Unlike the hurdle model, the "smoker" distribution here should include zero, but the zero counts are 'inflated' by the non-smokers' contribution too.
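The two-part structure of a hurdle model can be sketched as a small simulation. Everything here is an assumption for illustration: the function name `simulate_hurdle`, the probability of watching at all (`p_watch`), and the choice of a log-normal for the strictly positive part (borrowed from the log-normal suggestion elsewhere on this page). The positive part is not capped at 24 hours in this sketch.

```python
import random


def simulate_hurdle(n, p_watch, mu, sigma, seed=0):
    """Simulate daily watch times (hours) from a two-part hurdle model.

    Part 1: a Bernoulli draw decides whether the user watches at all.
    Part 2: watchers get a strictly positive time from a log-normal
            (an assumed choice; any strictly positive distribution works).
    """
    rng = random.Random(seed)
    hours = []
    for _ in range(n):
        if rng.random() < p_watch:
            hours.append(rng.lognormvariate(mu, sigma))  # passed the hurdle
        else:
            hours.append(0.0)  # did not watch that day
    return hours


# Hypothetical parameters: 60% of users watch on a given day,
# and the median time among watchers is exp(0) = 1 hour.
hours = simulate_hurdle(n=10_000, p_watch=0.6, mu=0.0, sigma=1.0)
zero_fraction = sum(h == 0.0 for h in hours) / len(hours)
print("fraction of exact zeros:", round(zero_fraction, 3))
```

In this sketch every zero comes from the Bernoulli part, which is what distinguishes a hurdle model from a zero-inflated one, where the second component could also produce zeros.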















            $endgroup$




















              1












              $begingroup$

              "Is there a better word for that distribution?"



              There's a worthwhile distinction here between using words to describe the properties of the distribution, versus trying to find a "name" for the distribution so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and for which you could estimate its parameters. In this latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are thinking about the modelling approach, it is worth considering what features you want to incorporate and how complicated or parsimonious you want your model to be.



              Being positively skewed is a property of the distribution, but stating it doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates: for example, the Gaussian (i.e. normal) distribution has zero skew, so it will not be appropriate to model your data if the skew is an important feature. There may be other properties of the data that are important to you too, e.g. that it's unimodal (has just one peak), that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch YouTube at all on a given day). You may also be interested in other properties like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and had zero or near-zero skew, it doesn't automatically follow that the normal distribution is "correct" for it!



              On the other hand, even if the population your data is drawn from actually did follow a particular distribution precisely, due to sampling error your dataset may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population the data was drawn from (and perhaps therefore ought to be incorporated in your model) or whether they are just artefacts from your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, then it is even plausible the underlying distribution is actually symmetric. The larger your data set and the larger the skewness, the less plausible this becomes. But while you could perform a significance test to see how convincing the evidence for skewness in the underlying population is, this may be missing the point as to whether a normal (or other zero skew) distribution is appropriate as a model ...



              Which properties of the data really matter for the purposes you are intending to model it? Note that if the skew is reasonably small and you do not care very much about it, even if the underlying population is genuinely skewed, then you might still find the normal distribution a useful model to approximate this true distribution of watching times. But you should check that this doesn't end up making silly predictions. Because a normal distribution has no highest or lowest possible value, then although extremely high or low values become increasingly unlikely, you will always find that your model predicts there is some probability of watching for a negative number of hours per day, or more than 24 hours. This gets more problematic for you if the predicted probability of such impossible events becomes high. If the skewness is so noteworthy you want to capture it as part of your model, then the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, then consider the skewed t. If you want to incorporate the physically possible upper and lower bounds, then consider using the truncated versions of these distributions. Many other probability distributions exist that can be skewed and unimodal (for appropriate parameter choices) such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times. A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as this is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, then consider building in a hurdle model.
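One way to check whether a normal model makes "silly predictions" is to compute how much probability it places on impossible values (below 0 hours or above 24 hours). The sketch below does this with the normal CDF written in terms of `math.erf`; the function names and the example means and standard deviations are hypothetical, not estimates from real data.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def impossible_mass(mean, sd, lower=0.0, upper=24.0):
    """Probability a normal model assigns outside the feasible range [lower, upper]."""
    below = normal_cdf(lower, mean, sd)
    above = 1.0 - normal_cdf(upper, mean, sd)
    return below + above


# Hypothetical model parameters, in hours per day.
print("mean 1 h, sd 1 h:   ", impossible_mass(1.0, 1.0))
print("mean 3 h, sd 0.5 h: ", impossible_mass(3.0, 0.5))
```

With a mean of 1 hour and a standard deviation of 1 hour, the model puts roughly 16% of its probability on negative watching times, which is hard to ignore. With a mean of 3 hours and a standard deviation of half an hour, the impossible region is six standard deviations away and the normal approximation is far less troubling on this particular count.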



              But at the point you are trying to throw in every feature you can identify from your data, and build an ever more sophisticated model, perhaps you should ask yourself why you are doing this? Would there be an advantage to a simpler model, for example it being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest to you, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to working with named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour that was present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution; though note that some rules apply to all distributions, e.g. Chebyshev's inequality guarantees that at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew. Unfortunately the empirical distribution will also inherit all of the properties of your data set that arose just by sampling error, not just those possessed by the underlying population, so you may find a histogram of your empirical distribution has some humps and dips that the population itself does not — if you want to improve matters, you can try taking a larger sample. You may want to investigate smoothed empirical distribution functions.
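The empirical distribution function and the Chebyshev bound mentioned above can both be sketched in a few lines. The helper names `make_ecdf` and `fraction_within` are illustrative, and the data are simulated from a log-normal purely as a stand-in for skewed watch times. The population standard deviation (`statistics.pstdev`) is used because Chebyshev's inequality, applied to the empirical distribution itself, is stated in terms of that quantity.

```python
import bisect
import random
import statistics


def make_ecdf(data):
    """Return a function x -> fraction of observations less than or equal to x."""
    ordered = sorted(data)
    n = len(ordered)

    def ecdf(x):
        # bisect_right counts how many sorted values are <= x
        return bisect.bisect_right(ordered, x) / n

    return ecdf


def fraction_within(data, k):
    """Fraction of observations within k population standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)


rng = random.Random(2)
# Simulated, right-skewed stand-in data (assumption, not real measurements).
data = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

ecdf = make_ecdf(data)
print("empirical P(X <= 1 hour):   ", round(ecdf(1.0), 3))
print("fraction within 2 sd:       ", round(fraction_within(data, 2.0), 3))
print("fraction within 1.96 sd:    ", round(fraction_within(data, 1.96), 3))
```

The fraction within two standard deviations can never drop below 0.75, whatever the skew, whereas the fraction within 1.96 standard deviations has no reason to equal 95% for data that are not normal.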



              In summary: although the normal distribution has zero skew, the fact your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest some other distribution may be more appropriate. You should also consider other properties of the data when choosing your model, besides the skew, and consider too the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not necessarily mean that such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to just use the empirical distribution itself, rather than to fit a standard distribution to it.

















              $endgroup$













                $begingroup$

                A fraction per day is certainly not negative. This rules out the normal distribution, which has probability mass over the entire real axis - in particular over the negative half.



                Power law distributions are often used to model things like income distributions, sizes of cities etc. They are nonnegative and typically highly skewed. These would be the first I would try in modeling time spent watching YouTube. (Or monitoring CrossValidated questions.)



                More information on power laws can be found here or here, or in our power-law tag.






                share|cite|improve this answer









                $endgroup$



                A fraction per day is certainly not negative. This rules out the normal distribution, which has probability mass over the entire real axis - in particular over the negative half.



                Power law distributions are often used to model things like income distributions, sizes of cities etc. They are nonnegative and typically highly skewed. These would be the first I would try in modeling time spent watching YouTube. (Or monitoring CrossValidated questions.)



                More information on power laws can be found here or here, or in our power-law tag.







                share|cite|improve this answer












                share|cite|improve this answer



                share|cite|improve this answer










                answered 2 days ago









                Stephan KolassaStephan Kolassa

                47.3k7100176




                47.3k7100176







                • 12
                  You're completely correct that normal distributions have support on the real line. And yet... they're not an awful model for some strictly positive quantities, like adults' height or weight, where the mean and variance are such that negative values are very unlikely under the model.
                  – Matt Krause
                  2 days ago

                • 1
                  @MattKrause That's actually a great question: is it equally likely that I will be 10 cm above or 10 cm below the mean height, or 10 percent above or 10 percent below it? Only the first case could warrant a normal distribution.
                  – Tomáš Kafka
                  17 hours ago

                • @MattKrause: I completely agree, in a general sense. Yet the present question is about the proportion of daily time spent watching YouTube. We don't have any data, but I would be extremely surprised if the distribution were even remotely symmetric.
                  – Stephan Kolassa
                  14 hours ago
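
To put rough numbers on the exchange above, here is a minimal sketch of how much probability a normal model places below zero. The means and standard deviations are made-up illustrative values (no data was given in this thread), and `negative_mass` is a hypothetical helper name.

```python
import math

def negative_mass(mean: float, sd: float) -> float:
    """P(X < 0) for X ~ Normal(mean, sd).

    Uses erfc so that very small tail probabilities do not round to zero.
    """
    z = (0.0 - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Assumed adult heights in cm: zero lies 17 standard deviations below the mean.
height_mass = negative_mass(mean=170.0, sd=10.0)

# Assumed fraction of the day spent on a site: zero lies less than one
# standard deviation below the mean.
fraction_mass = negative_mass(mean=0.02, sd=0.03)

print(f"height model:   P(X < 0) = {height_mass:.3g}")
print(f"fraction model: P(X < 0) = {fraction_mass:.3g}")
```

With these assumed values the height model puts a negligible amount of probability on negative heights, while the fraction-of-day model puts roughly a quarter of its probability on impossible negative fractions. That is one way to see why the normal can be a tolerable approximation in the first case and a poor one in the second.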












                35












                A normal distribution cannot be highly skewed; that is a contradiction in terms. Normally distributed variables have skewness = 0.

                answered 2 days ago
                Peter Flom

                • 1
                  What is a better way to describe the distribution? Is there a word for that type of distribution, where it centers around a mode and then has a long tail?
                  – Cauder
                  2 days ago

                • 10
                  Unimodal and skewed is as close as I can come...
                  – jbowman
                  2 days ago

                • 8
                  As an aside, it's just really incredible that people give their time to help other people get better at this stuff. I know it goes without saying, but it's so cool what you both do!
                  – Cauder
                  2 days ago

                • 5
                  Yes, but it's worth clarifying that this statement pertains to the normally distributed population. A sample drawn from that population can be very skewed.
                  – gung
                  2 days ago

                • When the skew value is small ("small" being decided by the people dealing with the stats in question), you can still treat the population as normal, albeit with minor error as a result.
                  – Carl Witthoft
                  11 hours ago
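
gung's distinction between the population and a sample can be checked with a short simulation. This is only a sketch: the sample sizes, repeat counts, and seed are arbitrary choices, and `sample_skewness` is a hand-rolled moment estimator rather than a library call.

```python
import random

def sample_skewness(xs):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (the biased version)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the run is repeatable

def skewness_spread(n, repeats):
    """Largest |skewness| seen across `repeats` standard-normal samples of size n."""
    return max(
        abs(sample_skewness([rng.gauss(0.0, 1.0) for _ in range(n)]))
        for _ in range(repeats)
    )

small = skewness_spread(n=10, repeats=2000)    # many tiny samples
large = skewness_spread(n=2000, repeats=200)   # fewer, much larger samples

print(f"largest |skewness| with n=10:   {small:.2f}")
print(f"largest |skewness| with n=2000: {large:.2f}")
```

Every sample here comes from a population whose skewness is exactly 0, yet small samples can show sizeable sample skewness purely by chance, while large samples stay close to 0.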












                15












                If it has a long right tail, then it's right-skewed.

                It can't be a normal distribution, since skew != 0; perhaps it's a unimodal skew normal distribution:

                https://en.wikipedia.org/wiki/Skew_normal_distribution

                answered 2 days ago
                behold





















                        9












                        It could be a log-normal distribution. As mentioned here:

                        "Users' dwell time on online articles (jokes, news etc.) follows a log-normal distribution."

                        The reference given is: Yin, Peifeng; Luo, Ping; Lee, Wang-Chien; Wang, Min (2013). Silence is also evidence: interpreting dwell time for recommendation from psychological perspective. ACM International Conference on KDD.

                        answered 2 days ago
                        Count Iblis
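
A practical consequence of the log-normal suggestion is that the logarithm of the data should look normal, and therefore roughly symmetric. The sketch below illustrates this on simulated values. The parameters (log-mean 1.0, log-sd 0.8) are assumptions chosen for illustration and are not taken from the cited paper.

```python
import math
import random

def sample_skewness(xs):
    """Moment-based sample skewness g1 = m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical dwell times (say, in minutes), drawn from a log-normal distribution.
dwell = [rng.lognormvariate(1.0, 0.8) for _ in range(20000)]

raw_skew = sample_skewness(dwell)                         # strongly right-skewed
log_skew = sample_skewness([math.log(t) for t in dwell])  # close to symmetric

print(f"skewness of raw times: {raw_skew:.2f}")
print(f"skewness of log times: {log_skew:.2f}")
```

On real data the same check applies: if the log-transformed times still show clear skewness, the log-normal is probably not a good description.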





















                                5












                                The gamma distribution could be a good candidate to describe this kind of distribution over nonnegative, right-skewed data. See the green line in the image here:
                                https://en.m.wikipedia.org/wiki/Gamma_distribution

                                answered yesterday
                                maurice
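
For a feel of how the gamma shape parameter controls the skew: a gamma distribution with shape k has skewness 2/sqrt(k), so small k gives a long right tail and large k looks nearly symmetric. Here is a small sketch comparing that formula with simulated samples. The shape values, the unit scale, and the sample size are arbitrary illustrative choices.

```python
import math
import random

def sample_skewness(xs):
    """Moment-based sample skewness g1 = m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(2)  # fixed seed so the run is repeatable

results = {}
for shape in (1.0, 2.0, 9.0):
    # random.gammavariate(alpha, beta): alpha is the shape, beta the scale.
    sample = [rng.gammavariate(shape, 1.0) for _ in range(50000)]
    results[shape] = (sample_skewness(sample), 2.0 / math.sqrt(shape), min(sample))

for shape, (empirical, theory, _) in results.items():
    print(f"shape={shape}: sample skewness {empirical:.2f}, theory {theory:.2f}")
```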





















                                        3












                                        "Normal" and "Gaussian" mean exactly the same thing. As other answers explain, the distribution you're talking about is not normal/Gaussian, because a normal distribution assigns positive probability density to every value on the real line, whereas your variable can only take values between $0$ and $24$.

                                        answered 15 hours ago
                                        David Richerby





















                                                2












                                                $begingroup$

                                                In the case at hand, since the time spent per day is bound from $0$ to $1$ (if quantified as a fraction of the day), distributions that are unbounded above (e.g. Pareto, skew-normal, Gamma, log-normal) won't work, but Beta would.






                                                share|cite|improve this answer









                                                $endgroup$



























answered 18 hours ago

J.G.





















How about a hurdle model?

A hurdle model has two parts. The first is a Bernoulli trial that determines whether you use YouTube at all. If you don't, your usage time is zero and you're done. If you do, you "pass the hurdle", and your usage time is drawn from some other, strictly positive distribution.

A closely related concept is the zero-inflated model. It is meant for situations where we observe a bunch of zeros but can't distinguish always-zeros from sometimes-zeros. For example, consider the number of cigarettes a person smokes each day. For non-smokers that number is always zero, but some smokers may not smoke on a given day (out of cigarettes? on a long flight?). Unlike in the hurdle model, the "smoker" distribution here should include zero, and the zero counts are additionally 'inflated' by the non-smokers' contribution.
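To make the difference concrete, here is a small simulation sketch of both models in plain Python. All the numbers (a 60% chance of using the site, gamma-distributed minutes, 70% non-smokers, a Poisson mean of 3 cigarettes) and all function names are assumptions picked for illustration. In the hurdle model every zero comes from failing the Bernoulli trial. In the zero-inflated model zeros come from two sources.

```python
import math
import random

rng = random.Random(1)


def hurdle_minutes(p_use=0.6, shape=2.0, scale=30.0):
    """Hurdle draw: a Bernoulli trial, then a strictly positive gamma time."""
    if rng.random() >= p_use:
        return 0.0  # did not pass the hurdle
    return rng.gammavariate(shape, scale)


def poisson(lam):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zero_inflated_cigarettes(p_nonsmoker=0.7, lam=3.0):
    """Zero-inflated draw: always-zeros mixed with counts that may also be zero."""
    if rng.random() < p_nonsmoker:
        return 0  # non-smoker: structural zero
    return poisson(lam)  # smoker: may still be zero on a given day


n = 50_000
hurdle_zeros = sum(hurdle_minutes() == 0.0 for _ in range(n)) / n
zip_zeros = sum(zero_inflated_cigarettes() == 0 for _ in range(n)) / n

# Under these assumed parameters, zeros in the zero-inflated model come from
# non-smokers plus smokers who happen to smoke nothing that day.
expected_zip_zeros = 0.7 + 0.3 * math.exp(-3.0)
print(hurdle_zeros, zip_zeros, expected_zip_zeros)
```

The simulated share of zeros in the hurdle model should sit near one minus the usage probability, while the zero-inflated share should exceed the non-smoker fraction by the smokers' own zero days.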



























answered 15 hours ago

Matt Krause

































                                                                $begingroup$

                                                                "Is there a better word for that distribution?"



There's a worthwhile distinction here between using words to describe the properties of the distribution and trying to find a "name" for it, so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and whose parameters you could estimate. In the latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming that the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are taking the modelling approach, it is worth considering which features you want to incorporate and how complicated or parsimonious you want your model to be.



Being positively skewed is an example of a property that the distribution has, but it doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates: for example, the Gaussian (i.e. normal) distribution has zero skew, so it will not be appropriate for your data if the skew is an important feature. Other properties of the data may be important to you too, e.g. that it's unimodal (has just one peak), that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch YouTube at all on a given day). You may also be interested in other properties, like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and zero or near-zero skew, it wouldn't automatically follow that the normal distribution is "correct" for it!

On the other hand, even if the population your data is drawn from did follow a particular distribution precisely, sampling error means your data set may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population (and perhaps therefore ought to be incorporated in your model) or just artefacts of your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, it is even plausible that the underlying distribution is actually symmetric. The larger your data set and the larger the skewness, the less plausible this becomes. But while you could perform a significance test to see how convincing the evidence for skewness in the population is, this may be missing the point as to whether a normal (or other zero-skew) distribution is appropriate as a model ...
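The point about noisy small samples can be illustrated with a short simulation. The sketch below assumes a standard normal population (true skewness zero) and uses the simple moment-based sample skewness with no bias correction. The function names, sample sizes and seed are illustrative choices, not part of the original answer.

```python
import random
import statistics


def sample_skewness(data):
    """Moment-based sample skewness m3 / m2**1.5 (no bias correction)."""
    mean = statistics.fmean(data)
    m2 = statistics.fmean([(x - mean) ** 2 for x in data])
    m3 = statistics.fmean([(x - mean) ** 3 for x in data])
    return m3 / m2 ** 1.5


rng = random.Random(2)


def skewness_spread(n, repeats=300):
    """Spread of the sample skewness across repeated samples of size n."""
    values = [
        sample_skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.pstdev(values)


small = skewness_spread(20)     # small samples from a symmetric population
large = skewness_spread(1000)   # much larger samples from the same population
print(f"spread at n=20: {small:.2f}, spread at n=1000: {large:.2f}")
```

Even though the population is perfectly symmetric here, samples of 20 routinely show visibly non-zero skewness, whereas samples of 1000 cluster much more tightly around zero.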



Which properties of the data really matter for the purposes you intend the model to serve? Note that if the skew is reasonably small and you do not care very much about it, then even if the underlying population is genuinely skewed, you might still find the normal distribution a useful approximation to the true distribution of watching times. But you should check that this doesn't end up making silly predictions. Because a normal distribution has no highest or lowest possible value, your model will always predict some probability of watching for a negative number of hours per day, or for more than 24 hours, even though extreme values become increasingly unlikely. This gets more problematic if the predicted probability of such impossible events becomes high.

If the skewness is so noteworthy that you want to capture it in your model, the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, consider the skewed t. If you want to incorporate the physically possible upper and lower bounds, consider truncated versions of these distributions. Many other probability distributions can be skewed and unimodal (for appropriate parameter choices), such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times. A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as it is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, consider building in a hurdle model.
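The "silly predictions" check described above is easy to carry out once a normal model has been fitted. The sketch below uses `statistics.NormalDist` from the Python standard library. The mean of 2 hours and standard deviation of 1.5 hours are hypothetical values chosen for illustration, not estimates from any real data.

```python
from statistics import NormalDist

# Hypothetical fitted values: mean 2 hours, standard deviation 1.5 hours.
model = NormalDist(mu=2.0, sigma=1.5)

p_negative = model.cdf(0.0)          # P(watching time < 0 hours)
p_over_day = 1.0 - model.cdf(24.0)   # P(watching time > 24 hours)

print(f"P(< 0 h)  = {p_negative:.3f}")
print(f"P(> 24 h) = {p_over_day:.3g}")
```

With these assumed parameters the model places roughly 9% of its probability on negative watching times, which would be hard to ignore, while the mass above 24 hours is negligible. Whether that is acceptable depends on what the model is being used for.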



But at the point where you are trying to throw in every feature you can identify from your data, and building an ever more sophisticated model, perhaps you should ask yourself why you are doing this. Would there be an advantage to a simpler model, for example being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution. Some rules do apply to all distributions, though: Chebyshev's inequality guarantees that at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew.

Unfortunately, the empirical distribution will also inherit the properties of your data set that arose just from sampling error, not only those possessed by the underlying population, so a histogram of your empirical distribution may have humps and dips that the population itself does not. If you want to improve matters, you can try taking a larger sample. You may also want to investigate smoothed empirical distribution functions.
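A minimal sketch of an empirical distribution function, together with a check of the Chebyshev bound, might look like the following. The data are simulated (an exponential draw with a mean of 1.5 hours, capped at 24), and `make_ecdf` is a hypothetical helper name. Both are assumptions for illustration only.

```python
import bisect
import random
import statistics


def make_ecdf(data):
    """Return F, where F(x) is the fraction of observations <= x."""
    ordered = sorted(data)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n


# Hypothetical skewed sample of hours watched per day (simulated).
rng = random.Random(3)
hours = [min(rng.expovariate(1 / 1.5), 24.0) for _ in range(5_000)]

ecdf = make_ecdf(hours)
mean = statistics.fmean(hours)
sd = statistics.pstdev(hours)  # standard deviation of the data themselves

# Fraction of the data within two standard deviations of the mean.
within_2sd = ecdf(mean + 2 * sd) - ecdf(mean - 2 * sd)
print(f"fraction within 2 SD: {within_2sd:.3f}")
```

However skewed the sample is, `within_2sd` cannot fall below 0.75 when the mean and standard deviation are computed from the same data, which is the distribution-free guarantee mentioned above. The ECDF itself has no parameters to estimate, at the cost of reproducing every quirk of the sample.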



In summary: although the normal distribution has zero skew, the fact that your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest some other distribution may be more appropriate. When choosing your model, you should also consider properties of the data besides the skew, as well as the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not necessarily mean that such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to use the empirical distribution itself, rather than fit a standard distribution to it.

















                                                                $endgroup$

















                                                                  1












                                                                  $begingroup$

                                                                  "Is there a better word for that distribution?"



                                                                  There's a worthwhile distinction here between using words to describe the properties of the distribution, versus trying to find a "name" for the distribution so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and for which you could estimate its parameters. In this latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are thinking about the modelling approach, it is worth considering what features you want to incorporate and how complicated or parsimonious you want your model to be.



                                                                  Being positively skewed is an example of describing a property that the distribution has, but doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates, for example the Gaussian (i.e. normal) distribution has zero skew so will not be appropriate to model your data if the skew is an important feature. There may be other properties of the data that are important to you too, e.g. that it's unimodal (has just one peak) or that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch youtube at all on a given day). You may also be interested in other properties like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and had zero or near-zero skew, it doesn't automatically follow that the normal distribution is "correct" for it! On the other hand, even if the population your data is drawn from actually did follow a particular distribution precisely, due to sampling error your dataset may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population the data was drawn from (and perhaps therefore ought to be incorporated in your model) or whether they are just artefacts from your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, then it is even plausible the underlying distribution is actually symmetric. 
The larger your data set and the larger the skewness, the less plausible this becomes — but while you could perform a significance test to see how convincing is the evidence your data provides for skewness in the population it was drawn from, this may be missing the point as to whether a normal (or other zero skew) distribution is appropriate as a model ...



                                                                  Which properties of the data really matter for the purposes you are intending to model it? Note that if the skew is reasonably small and you do not care very much about it, even if the underlying population is genuinely skewed, then you might still find the normal distribution a useful model to approximate this true distribution of watching times. But you should check that this doesn't end up making silly predictions. Because a normal distribution has no highest or lowest possible value, then although extremely high or low values become increasingly unlikely, you will always find that your model predicts there is some probability of watching for a negative number of hours per day, or more than 24 hours. This gets more problematic for you if the predicted probability of such impossible events becomes high. If the skewness is so noteworthy you want to capture it as part of your model, then the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, then consider the skewed t. If you want to incorporate the physically possible upper and lower bounds, then consider using the truncated versions of these distributions. Many other probability distributions exist that can be skewed and unimodal (for appropriate parameter choices) such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times. A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as this is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, then consider building in a hurdle model.



                                                                  But at the point you are trying to throw in every feature you can identify from your data, and build an ever more sophisticated model, perhaps you should ask yourself why you are doing this? Would there be an advantage to a simpler model, for example it being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest to you, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to working with named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour that was present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution; though note that some rules apply to all distributions, e.g. Chebyshev's inequality guarantees that at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew. Unfortunately the empirical distribution will also inherit all of the properties of your data set that arose just by sampling error, not just those possessed by the underlying population, so you may find a histogram of your empirical distribution has some humps and dips that the population itself does not — if you want to improve matters, you can try taking a larger sample. You may want to investigate smoothed empirical distribution functions.



                                                                  In summary: although the normal distribution has zero skew, the fact your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest some other distribution may be more appropriate. You should also consider other properties of the data when choosing your model, besides the skew, and consider too the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not necessarily mean that such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to just use the empirical distribution itself, rather than to fit a standard distribution to it.






                                                                  share|cite|improve this answer











                                                                  $endgroup$















                                                                    1












                                                                    1








                                                                    1





                                                                    $begingroup$

                                                                    "Is there a better word for that distribution?"



                                                                    There's a worthwhile distinction here between using words to describe the properties of the distribution, versus trying to find a "name" for the distribution so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and for which you could estimate its parameters. In this latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are thinking about the modelling approach, it is worth considering what features you want to incorporate and how complicated or parsimonious you want your model to be.



                                                                    Being positively skewed is an example of describing a property that the distribution has, but doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates, for example the Gaussian (i.e. normal) distribution has zero skew so will not be appropriate to model your data if the skew is an important feature. There may be other properties of the data that are important to you too, e.g. that it's unimodal (has just one peak) or that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch youtube at all on a given day). You may also be interested in other properties like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and had zero or near-zero skew, it doesn't automatically follow that the normal distribution is "correct" for it! On the other hand, even if the population your data is drawn from actually did follow a particular distribution precisely, due to sampling error your dataset may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population the data was drawn from (and perhaps therefore ought to be incorporated in your model) or whether they are just artefacts from your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, then it is even plausible the underlying distribution is actually symmetric. 
The larger your data set and the larger the skewness, the less plausible this becomes — but while you could perform a significance test to see how convincing is the evidence your data provides for skewness in the population it was drawn from, this may be missing the point as to whether a normal (or other zero skew) distribution is appropriate as a model ...



                                                                    Which properties of the data really matter for the purposes you are intending to model it? Note that if the skew is reasonably small and you do not care very much about it, even if the underlying population is genuinely skewed, then you might still find the normal distribution a useful model to approximate this true distribution of watching times. But you should check that this doesn't end up making silly predictions. Because a normal distribution has no highest or lowest possible value, then although extremely high or low values become increasingly unlikely, you will always find that your model predicts there is some probability of watching for a negative number of hours per day, or more than 24 hours. This gets more problematic for you if the predicted probability of such impossible events becomes high. If the skewness is so noteworthy you want to capture it as part of your model, then the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, then consider the skewed t. If you want to incorporate the physically possible upper and lower bounds, then consider using the truncated versions of these distributions. Many other probability distributions exist that can be skewed and unimodal (for appropriate parameter choices) such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times. A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as this is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, then consider building in a hurdle model.



                                                                    But at the point you are trying to throw in every feature you can identify from your data, and build an ever more sophisticated model, perhaps you should ask yourself why you are doing this? Would there be an advantage to a simpler model, for example it being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest to you, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to working with named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour that was present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution; though note that some rules apply to all distributions, e.g. Chebyshev's inequality guarantees that at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew. Unfortunately the empirical distribution will also inherit all of the properties of your data set that arose just by sampling error, not just those possessed by the underlying population, so you may find a histogram of your empirical distribution has some humps and dips that the population itself does not — if you want to improve matters, you can try taking a larger sample. You may want to investigate smoothed empirical distribution functions.



                                                                    In summary: although the normal distribution has zero skew, the fact your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest some other distribution may be more appropriate. You should also consider other properties of the data when choosing your model, besides the skew, and consider too the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not necessarily mean that such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to just use the empirical distribution itself, rather than to fit a standard distribution to it.






                                                                    share|cite|improve this answer











                                                                    $endgroup$



                                                                    "Is there a better word for that distribution?"



                                                                    There's a worthwhile distinction here between using words to describe the properties of the distribution, versus trying to find a "name" for the distribution so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and for which you could estimate its parameters. In this latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are thinking about the modelling approach, it is worth considering what features you want to incorporate and how complicated or parsimonious you want your model to be.



Being positively skewed is an example of a property that the distribution has, but it doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates: for example, the Gaussian (i.e. normal) distribution has zero skew, so it will not be appropriate for modelling your data if the skew is an important feature. There may be other properties of the data that are important to you too, e.g. that it's unimodal (has just one peak), that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch YouTube at all on a given day). You may also be interested in other properties, like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and had zero or near-zero skew, it wouldn't automatically follow that the normal distribution is "correct" for it!

On the other hand, even if the population your data is drawn from actually did follow a particular distribution precisely, sampling error means your data set may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population (and perhaps therefore ought to be incorporated in your model) or just artefacts of your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, then it is even plausible that the underlying distribution is actually symmetric. The larger your data set and the larger the skewness, the less plausible this becomes. You could perform a significance test to see how convincing the evidence for skewness in the population is, but this may miss the point as to whether a normal (or other zero-skew) distribution is appropriate as a model.
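To get a feel for how noisy sample skewness is in small data sets, here is a minimal sketch using only the Python standard library. It repeatedly draws samples from a symmetric (normal) population, whose true skewness is zero, and reports how much the sample skewness varies from one sample to the next. The function names, sample sizes, and seed are my own illustrative choices, not anything taken from the question.

```python
import random
import statistics

def sample_skewness(xs):
    """Moment-based sample skewness g1 = m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def skewness_spread(n, reps=500, seed=0):
    """Standard deviation of the sample skewness over repeated normal samples of size n."""
    rng = random.Random(seed)
    skews = [
        sample_skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(skews)

# The population is symmetric in both cases; only the sample size differs.
small = skewness_spread(20)
large = skewness_spread(2000, reps=200)
print(f"spread of sample skewness, n=20:   {small:.3f}")
print(f"spread of sample skewness, n=2000: {large:.3f}")
```

With small samples the skewness estimate bounces around a lot even though the population is perfectly symmetric, which is why a modest sample skew on its own says little about the population.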



Which properties of the data really matter for the purposes you intend to use the model for? Note that if the skew is reasonably small and you do not care very much about it, then even if the underlying population is genuinely skewed, you might still find the normal distribution a useful model to approximate the true distribution of watching times. But you should check that this doesn't end up making silly predictions. A normal distribution has no highest or lowest possible value, so although extremely high or low values become increasingly unlikely, your model will always predict some probability of watching for a negative number of hours per day, or for more than 24 hours. This gets more problematic if the predicted probability of such impossible events becomes high.

If the skewness is so noteworthy that you want to capture it as part of your model, then the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, then consider the skewed t distribution. If you want to incorporate the physically possible upper and lower bounds, then consider using the truncated versions of these distributions. Many other probability distributions exist that can be skewed and unimodal (for appropriate parameter choices), such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times. A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as this is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, then consider building in a hurdle model.
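The "silly predictions" check can be made concrete. The sketch below, which assumes you have already summarised your data by a mean and standard deviation in hours, computes how much probability a normal model places outside the physically possible range of 0 to 24 hours. The three (mean, sd) pairs are hypothetical values chosen for illustration and are not estimates from any real data.

```python
from statistics import NormalDist

def impossible_mass(mean_hours, sd_hours, lower=0.0, upper=24.0):
    """Probability a normal model assigns to values outside [lower, upper]."""
    model = NormalDist(mu=mean_hours, sigma=sd_hours)
    below = model.cdf(lower)          # P(X < lower), e.g. negative hours
    above = 1.0 - model.cdf(upper)    # P(X > upper), e.g. more than 24 hours
    return below + above

# Hypothetical summary statistics, not taken from any real data set.
for mean_hours, sd_hours in [(3.0, 1.0), (1.0, 1.0), (0.5, 1.5)]:
    p = impossible_mass(mean_hours, sd_hours)
    print(f"mean={mean_hours}, sd={sd_hours}: P(impossible) = {p:.4f}")
```

When the mean sits several standard deviations above zero, the impossible mass is negligible and the normal model may be tolerable. When the mean is within about one standard deviation of zero, a sizeable share of the model's probability lands on negative watching times, which is a sign that a bounded or skewed alternative deserves a look.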



But once you are trying to throw in every feature you can identify from your data and build an ever more sophisticated model, perhaps you should ask yourself why you are doing this. Would there be an advantage to a simpler model, for example being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest to you, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to working with named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour that was present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution; though note that some rules apply to all distributions, e.g. Chebyshev's inequality guarantees that at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew.

Unfortunately, the empirical distribution will also inherit all of the properties of your data set that arose just by sampling error, not only those possessed by the underlying population, so you may find that a histogram of your empirical distribution has some humps and dips that the population itself does not. If you want to improve matters, you can try taking a larger sample. You may also want to investigate smoothed empirical distribution functions.
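As a rough illustration of working with the empirical distribution directly, the following standard-library sketch builds an empirical distribution function and checks the "within k standard deviations" rules against it. The data here are simulated: I am assuming a right-skewed sample of daily hours generated from a gamma distribution capped at 24, purely as a stand-in for real watching times.

```python
import random
import statistics
from bisect import bisect_right

def make_ecdf(data):
    """Return F, where F(x) is the fraction of observations <= x."""
    ordered = sorted(data)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def fraction_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # standard deviation of the data themselves
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

rng = random.Random(1)
# Hypothetical right-skewed "hours watched" sample; gamma draws capped at 24.
hours = [min(rng.gammavariate(1.5, 1.2), 24.0) for _ in range(1000)]

ecdf = make_ecdf(hours)
print(f"estimated P(hours <= 2): {ecdf(2.0):.3f}")
print(f"within 1.96 sd: {fraction_within(hours, 1.96):.3f}")
print(f"within 2 sd:    {fraction_within(hours, 2.0):.3f}  (Chebyshev bound: 0.75)")
```

The fraction within 1.96 standard deviations need not be 95% for skewed data, whereas the fraction within two standard deviations can never fall below the 75% that Chebyshev's inequality guarantees.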



In summary: although the normal distribution has zero skew, the fact that your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest that some other distribution may be more appropriate. When choosing your model, you should also consider properties of the data besides the skew, as well as the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not necessarily mean that such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to just use the empirical distribution itself, rather than fitting a standard distribution to it.















answered 4 hours ago by Silverfish (edited 1 hour ago)



























또 다른 차별”“오피니언 [편집자에게] '왕따'와 '패거리 정치' 심리는 닮은꼴”“[미래한국리포트] 무한경쟁에 빠진 대한민국”“대학생 98% "외모가 경쟁력이라는 말 동의"”“특급호텔 웨딩·200만원대 유모차… "남보다 더…" 호화病, 고질병 됐다”“[스트레스 공화국] ① 경쟁사회, 스트레스 쌓인다”““매일 30여명 자살 한국, 의사보다 무속인에…””“"자살 부르는 '우울증', 환자 중 85% 치료 안 받아"”“정신병원을 가다”“대한민국도 ‘묻지마 범죄’,안전지대 아니다”“유엔 "학생 '성적 지향'에 따른 차별 금지하라"”“유엔아동권리위원회 보고서 및 번역본 원문”“고졸 성공스토리 담은 '제빵왕 김탁구' 드라마 나온다”“‘빛 좋은 개살구’ 고졸 취업…실습 대신 착취”원본 문서“정신건강, 사회적 편견부터 고쳐드립니다”‘소통’과 ‘행복’에 목 마른 사회가 잠들어 있던 ‘심리학’ 깨웠다“[포토] 사유리-곽금주 교수의 유쾌한 심리상담”“"올해 한국인 평균 영화관람횟수 세계 1위"(종합)”“[게임연중기획] 게임은 문화다-여가활동 1순위 게임”“영화속 ‘영어 지상주의’ …“왠지 씁쓸한데””“2월 `신문 부수 인증기관` 지정..방송법 후속작업”“무료신문 성장동력 ‘차별성’과 ‘갈등해소’”대한민국 국회 법률지식정보시스템"Pew Research Center's Religion & Public Life Project: South Korea"“amp;vwcd=MT_ZTITLE&path=인구·가구%20>%20인구총조사%20>%20인구부문%20>%20 총조사인구(2005)%20>%20전수부문&oper_YN=Y&item=&keyword=종교별%20인구& amp;lang_mode=kor&list_id= 2005년 통계청 인구 총조사”원본 문서“한국인이 좋아하는 취미와 운동 (2004-2009)”“한국인이 좋아하는 취미와 운동 (2004-2014)”Archived“한국, `부분적 언론자유국' 강등〈프리덤하우스〉”“국경없는기자회 "한국, 인터넷감시 대상국"”“한국, 조선산업 1위 유지(S. Korea Stays Top Shipbuilding Nation) RZD-Partner Portal”원본 문서“한국, 4년 만에 ‘선박건조 1위’”“옛 마산시,인터넷속도 세계 1위”“"한국 초고속 인터넷망 세계1위"”“인터넷·휴대폰 요금, 외국보다 훨씬 비싸”“한국 관세행정 6년 연속 세계 '1위'”“한국 교통사고 사망자 수 OECD 회원국 중 2위”“결핵 후진국' 한국, 환자가 급증한 이유는”“수술은 신중해야… 자칫하면 생명 위협”대한민국분류대한민국의 지도대한민국 정부대표 다국어포털대한민국 전자정부대한민국 국회한국방송공사about korea and information korea브리태니커 백과사전(한국편)론리플래닛의 정보(한국편)CIA의 세계 정보(한국편)마리암 부디아 (Mariam Budia),『한국: 하늘이 내린 한 폭의 그림』, 서울: 트랜스라틴 19호 (2012년 3월)대한민국ehehehehehehehehehehehehehehWorldCat132441370n791268020000 0001 2308 81034078029-6026373548cb11863345f(데이터)00573706ge128495