{"id":1705,"date":"2017-02-14T09:05:02","date_gmt":"2017-02-14T12:05:02","guid":{"rendered":"https:\/\/www.nachodelatorre.com.ar\/mosconi\/?p=1705"},"modified":"2017-02-14T09:05:02","modified_gmt":"2017-02-14T12:05:02","slug":"diferencia-entre-aprendizaje-automatico-ciencia-de-datos-inteligencia-artificial-aprendizaje-profundo-y-estadisticas","status":"publish","type":"post","link":"https:\/\/www.fie.undef.edu.ar\/ceptm\/?p=1705","title":{"rendered":"Difference between Machine Learning, Data Science, Artificial Intelligence, Deep Learning, and Statistics"},"content":{"rendered":"<p>An overview of the role of the data scientist, and of how data science compares and overlaps with related fields such as machine learning, deep learning, AI, statistics, IoT, operations research, and applied mathematics.<!--more--><\/p>\n<p>In this article, I clarify the various roles of the data scientist, and how data science compares and overlaps with related fields such as machine learning, deep learning, AI, statistics, IoT, operations research, and applied mathematics. As data science is a broad discipline, I start by describing the different types of data scientists that one may encounter in any business setting: you might even discover that you are a data scientist yourself, without knowing it. 
As in any scientific discipline, data scientists may borrow techniques from related disciplines, though we have developed our own arsenal, especially techniques and algorithms to handle very large unstructured data sets in automated ways, even without human interaction, to perform transactions in real-time or to make predictions.<\/p>\n<p><a href=\"http:\/\/api.ning.com:80\/files\/-q6tvS*EPsLDs8MwQuRIsxLEFTbYfjyLDIqHpbih0OnE5dry2VTTzwfUcIrZ5zvXz9NIUHqPPpih2O4T*VUe6yCnt-yd*4Gr\/Capturevcx.PNG\" target=\"_self\" rel=\"noopener noreferrer\"><img class=\"align-center aligncenter\" src=\"http:\/\/api.ning.com:80\/files\/-q6tvS*EPsLDs8MwQuRIsxLEFTbYfjyLDIqHpbih0OnE5dry2VTTzwfUcIrZ5zvXz9NIUHqPPpih2O4T*VUe6yCnt-yd*4Gr\/Capturevcx.PNG\" alt=\"\" width=\"489\" \/><\/a><\/p>\n<p><span class=\"font-size-4\"><strong>1. Different Types of Data Scientists<\/strong><\/span><\/p>\n<p>To get started and gain some historical perspective, you can read my article about <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/six-categories-of-data-scientists\" target=\"_blank\" rel=\"noopener noreferrer\">9 types of data scientists<\/a>, published in 2014, or my article where I compare data science with <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/17-analytic-disciplines-compared\" target=\"_blank\" rel=\"noopener noreferrer\">16 analytic disciplines<\/a>, also published in 2014.<\/p>\n<p>The following articles, published during the same time period, are still useful:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/data-scientist-versus-data-architect\">Data Scientist versus Data Architect<\/a><\/li>\n<li><a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/data-scientist-versus-data-engineer\">Data Scientist versus Data Engineer<\/a><\/li>\n<li><a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/data-scientist-versus-statistician\">Data Scientist versus Statistician<\/a><\/li>\n<li><a 
href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/data-scientist-versus-business-analyst\">Data Scientist versus Business Analyst<\/a><\/li>\n<\/ul>\n<p>More recently (August 2016), <a href=\"http:\/\/www.datasciencecentral.com\/profile\/ajitjaokar\" target=\"_blank\" rel=\"noopener noreferrer\">Ajit Jaokar<\/a> discussed Type A (Analytics) versus Type B (Builder) data scientists:<\/p>\n<ul>\n<li><em>The Type A Data Scientist can code well enough to work with data but is not necessarily an expert. The Type A data scientist may be an expert in experimental design, forecasting, modelling, statistical inference, or other things typically taught in statistics departments. Generally speaking though, the work product of a data scientist is not &#8220;p-values and confidence intervals&#8221; as academic statistics sometimes seems to suggest (and as it sometimes is for traditional statisticians working in the pharmaceutical industry, for example). At Google, Type A Data Scientists are known variously as Statistician, Quantitative Analyst, Decision Support Engineering Analyst, or Data Scientist, and probably a few more.<\/em><\/li>\n<\/ul>\n<ul>\n<li><em>Type B Data Scientist: The B is for Building. Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. 
The Type B Data Scientist is mainly interested in using data &#8220;in production.&#8221; They build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results).\u00a0<\/em><em>Source: click <a href=\"http:\/\/www.kdnuggets.com\/2016\/08\/become-type-a-data-scientist.html\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>.<\/em><\/li>\n<\/ul>\n<p>I also wrote about the <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/the-abcd-s-of-business-optimization\" target=\"_blank\" rel=\"noopener noreferrer\">ABCD&#8217;s of business processes optimization<\/a>\u00a0where D stands for data science, C for computer science, B for business science, and A for analytics science. Data science may or may not involve coding or mathematical practice, as you can read in my article on\u00a0<a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/high-level-versus-low-level-data-science\" target=\"_blank\" rel=\"noopener noreferrer\">low-level versus high-level data science<\/a>. In a startup, data scientists generally wear several hats, such as executive, data miner, data engineer or architect, researcher, statistician, modeler (as in predictive modeling) or developer.<\/p>\n<p>While the data scientist is generally portrayed as a coder experienced in R, Python, SQL, Hadoop and statistics, this is just the tip of the iceberg, made popular by data camps focusing on teaching some elements of data science. But although a lab technician can call herself a physicist, a real physicist is much more than that, and her domains of expertise are varied: astronomy, mathematical physics, nuclear physics (which is borderline chemistry), mechanics, electrical engineering, signal processing (also a sub-field of data science) and many more. 
The same can be said about data scientists: their fields are as varied as bioinformatics, information technology, simulations and quality control, computational finance, epidemiology, industrial engineering, <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/prime-numbers-interesting-distribution-and-density-results\" target=\"_blank\" rel=\"noopener noreferrer\">and even number theory<\/a>.<\/p>\n<p>In my case, over the last 10 years, I specialized in machine-to-machine and device-to-device communications, developing systems to automatically process large data sets, to perform automated transactions: for instance, purchasing Internet traffic or automatically generating content. It implies developing algorithms that work with unstructured data, and it is at the intersection of AI (artificial intelligence), IoT (Internet of Things), and data science. This is referred to as <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/8-deep-data-science-articles\" target=\"_blank\" rel=\"noopener noreferrer\">deep data science<\/a>. It is relatively math-free, and it involves relatively little coding (mostly APIs), but it is quite data-intensive (including building data systems) and based on brand new statistical technology designed specifically for this context.<\/p>\n<p>Prior to that, I worked on credit card fraud detection in real time. Earlier in my career (circa 1990) I worked on image remote sensing technology, among other things to identify patterns (or shapes or features, for instance lakes) in satellite images and to perform image segmentation: at that time my research was labeled as computational statistics, but the people doing the exact same thing in the computer science department next door at my home university called their research artificial intelligence. 
Today, it would be called data science or artificial intelligence, the sub-domains being signal processing, computer vision or IoT.<\/p>\n<p>Also, data scientists can be found anywhere in the <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/life-cycle-of-data-science-projects?xg_source=activity\" target=\"_blank\" rel=\"noopener noreferrer\">lifecycle of data science projects<\/a>, at the data gathering stage, or the data exploratory stage, all the way up to statistical modeling and maintaining existing systems.<\/p>\n<p><span class=\"font-size-4\"><strong>2. Machine Learning versus Deep Learning<\/strong><\/span><\/p>\n<p>Before digging deeper into the link between data science and machine learning, let&#8217;s briefly discuss machine learning and deep learning. Machine learning is a set of algorithms that train on a data set to make predictions or take actions in order to optimize some systems. For instance, supervised classification algorithms are used to classify potential clients into good or bad prospects, for loan purposes, based on historical data. The techniques involved, for a given task (e.g. supervised classification), are varied: naive Bayes, SVM, neural nets, ensembles, association rules, decision trees, logistic regression, or a combination of many. For a detailed list of algorithms, <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/top-10-machine-learning-algorithms\" target=\"_blank\" rel=\"noopener noreferrer\">click here<\/a>. For a list of machine learning problems, <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/top-20-uses-of-statistical-modeling\" target=\"_blank\" rel=\"noopener noreferrer\">click here<\/a>.<\/p>\n<p>All of this is a subset of data science. When these algorithms are automated, as in automated piloting or driverless cars, it is called AI, and more specifically, deep learning. 
<a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/deep-learning-definition-resources-comparison-with-machine-learni\" target=\"_blank\" rel=\"noopener noreferrer\">Click here<\/a>\u00a0for another article comparing machine learning with deep learning.\u00a0If the data collected comes from sensors and if it is transmitted via the Internet, then it is machine learning or data science or deep learning applied to IoT.<\/p>\n<p>Some people have a different definition for deep learning. They define deep learning as neural networks (a machine learning technique) with many layers. The question was asked on Quora recently, and below is a more detailed explanation (source: <a href=\"https:\/\/www.quora.com\/What-is-the-difference-between-AI-Machine-Learning-NLP-and-Deep-Learning\/answer\/Dmitriy-Genzel?ref=t_page\" target=\"_blank\" rel=\"noopener noreferrer\">Quora<\/a>):<\/p>\n<ul>\n<li><em>AI (<span class=\"qlink_container\"><a class=\"external_link\" href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">Artificial intelligence<\/a><\/span>) is a subfield of computer science, that was created in the 1960s, and it was (is) concerned with solving tasks that are easy for humans, but hard for computers. In particular, a so-called Strong AI would be a system that can do anything a human can (perhaps without purely physical things). 
This is fairly generic, and includes all kinds of tasks, such as planning, moving around in the world, recognizing objects and sounds, speaking, translating, performing social or business transactions, creative work (making art or poetry), etc.<\/em><\/li>\n<\/ul>\n<ul>\n<li><em>NLP (<span class=\"qlink_container\"><a class=\"external_link\" href=\"https:\/\/en.wikipedia.org\/wiki\/Natural_language_processing\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">Natural language processing<\/a><\/span>) is simply the part of AI that has to do with language (usually written).<\/em><\/li>\n<\/ul>\n<ul>\n<li><em><span class=\"qlink_container\"><a class=\"external_link\" href=\"https:\/\/en.wikipedia.org\/wiki\/Machine_learning\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">Machine learning<\/a><\/span>\u00a0is concerned with one aspect of this: given some AI problem that can be described in discrete terms (e.g. out of a particular set of actions, which one is the right one), and given a lot of information about the world, figure out what is the \u201ccorrect\u201d action, without having the programmer program it in. Typically some outside process is needed to judge whether the action was correct or not. In mathematical terms, it\u2019s a function: you feed in some input, and you want it to produce the right output, so the whole problem is simply to build a model of this mathematical function in some automatic way. To draw a distinction with AI, if I can write a very clever program that has human-like behavior, it can be AI, but unless its parameters are automatically learned from data, it\u2019s not machine learning.<\/em><\/li>\n<\/ul>\n<ul>\n<li><em><span class=\"qlink_container\"><a class=\"external_link\" href=\"https:\/\/en.wikipedia.org\/wiki\/Deep_learning\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">Deep learning<\/a><\/span>\u00a0is one kind of machine learning that\u2019s very popular now. 
It involves a particular kind of mathematical model that can be thought of as a composition of simple blocks (function composition) of a certain type, and where some of these blocks can be adjusted to better predict the final outcome.<\/em><\/li>\n<\/ul>\n<p><strong>What is the difference between machine learning and statistics?<\/strong><\/p>\n<p><a href=\"http:\/\/www.edvancer.in\/machine-learning-vs-statistics\/\" target=\"_blank\" rel=\"noopener noreferrer\">This article<\/a>\u00a0tries to answer the question. The author writes that statistics is machine learning with confidence intervals for the quantities being predicted or estimated. I tend to disagree, as I have built <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/black-box-confidence-intervals-excel-and-perl-implementations-det\" target=\"_blank\" rel=\"noopener noreferrer\">engineer-friendly confidence intervals<\/a>\u00a0that don&#8217;t require any mathematical or statistical knowledge.<\/p>\n<p><span class=\"font-size-4\"><strong>3. Data Science versus Machine Learning<\/strong><\/span><\/p>\n<p>Machine learning and statistics are part of data science. The word <em>learning<\/em> in machine learning means that the algorithms depend on some data, used as a training set, to fine-tune some model or algorithm parameters. This encompasses many techniques such as regression, naive Bayes or supervised classification. But not all techniques fit in this category. For instance, unsupervised clustering &#8211; a statistical and data science technique &#8211; aims at detecting clusters and cluster structures without any a priori knowledge or training set to help the classification algorithm. A human being is needed to label the clusters found. Some techniques are hybrid, such as semi-supervised classification. Some pattern detection or density estimation techniques fit in this category.<\/p>\n<p>Data science is much more than machine learning though. 
Data, in data science, may or may not come from a <em>machine<\/em> or mechanical process (survey data could be manually collected, clinical trials involve a specific type of small data), and it might have nothing to do with <em>learning<\/em> as I have just discussed. But the main difference is the fact that data science covers the whole spectrum of data processing, not just the algorithmic or statistical aspects. In particular, data science also covers:<\/p>\n<ul>\n<li>data integration<\/li>\n<li>distributed architecture<\/li>\n<li>automating machine learning<\/li>\n<li>data visualization<\/li>\n<li>dashboards and BI<\/li>\n<li>data engineering<\/li>\n<li>deployment in production mode<\/li>\n<li>automated, data-driven decisions<\/li>\n<\/ul>\n<p>Of course, in many organisations, data scientists focus on only one part of this process. To read about some of my original contributions to data science, <a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/my-data-science-machine-learning-and-related-articles\" target=\"_blank\" rel=\"noopener noreferrer\">click here<\/a>.<\/p>\n<p><strong>Source:<\/strong> <em><a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blogs\/difference-between-machine-learning-data-science-ai-deep-learning\" target=\"_blank\" rel=\"noopener noreferrer\">http:\/\/www.datasciencecentral.com<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>An overview of the role of the data scientist, and of how data science compares and overlaps with related fields such as machine learning,&hellip; 
<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[23,29],"tags":[],"_links":{"self":[{"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=\/wp\/v2\/posts\/1705"}],"collection":[{"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1705"}],"version-history":[{"count":0,"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=\/wp\/v2\/posts\/1705\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1705"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1705"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fie.undef.edu.ar\/ceptm\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1705"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}