{"id":2111,"date":"2023-01-27T13:30:00","date_gmt":"2023-01-27T13:30:00","guid":{"rendered":"https:\/\/ml4data.com\/?p=2111"},"modified":"2023-01-27T14:06:21","modified_gmt":"2023-01-27T14:06:21","slug":"r-tidyverse-vs-python-pandas","status":"publish","type":"post","link":"https:\/\/ml4data.com\/index.php\/2023\/01\/27\/r-tidyverse-vs-python-pandas\/","title":{"rendered":"R-Tidyverse vs Python-Pandas"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"2111\" class=\"elementor elementor-2111\">\n\t\t\t\t\t\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-3259aee7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3259aee7\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7198ddbd\" data-id=\"7198ddbd\" data-element_type=\"column\" data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-2827928c ob-harakiri-inherit ob-has-background-overlay elementor-widget elementor-widget-text-editor\" data-id=\"2827928c\" data-element_type=\"widget\" 
data-settings=\"{&quot;_ob_use_harakiri&quot;:&quot;yes&quot;,&quot;_ob_harakiri_writing_mode&quot;:&quot;inherit&quot;,&quot;_ob_postman_use&quot;:&quot;no&quot;,&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! elementor - v3.11.5 - 14-03-2023 *\/\n.elementor-widget-text-editor.elementor-drop-cap-view-stacked .elementor-drop-cap{background-color:#818a91;color:#fff}.elementor-widget-text-editor.elementor-drop-cap-view-framed .elementor-drop-cap{color:#818a91;border:3px solid;background-color:transparent}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap{margin-top:8px}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap-letter{width:1em;height:1em}.elementor-widget-text-editor .elementor-drop-cap{float:left;text-align:center;line-height:1;font-size:50px}.elementor-widget-text-editor .elementor-drop-cap-letter{display:inline-block}<\/style>\t\t\t\t<p><!-- wp:paragraph --><\/p>\n<p>Data transformation, the process of converting raw data into a format more suitable for analysis, is a crucial step in data analysis. Two popular tools for data transformation are R-tidyverse and Python-pandas. Both are widely used by data scientists and analysts, but they have different strengths and weaknesses.<\/p><p><span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-size: 1.4rem;\">R-tidyverse is a collection of R packages designed for data science, including packages for data manipulation (dplyr, tidyr), data visualization (ggplot2), functional programming (purrr), and date-time handling (lubridate), among others. 
Its main strength is that its consistent grammar and pipe operator make data manipulation and transformation simple, readable, and explicit. Additionally, it has a wide range of tools available for data manipulation and cleaning. However, one of its weaknesses is that it can be slower than alternatives when dealing with very large datasets.<\/span><\/p><p><span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-size: 1.4rem;\">Python-pandas is a powerful library for data manipulation and transformation in Python. It offers a wide range of data structures and data manipulation functions. One of its main strengths is fast, vectorized processing of datasets that fit in memory. Additionally, it is widely used in the data science community and has a large number of resources available online. On the other hand, one of its weaknesses is that the syntax for data manipulation can be less explicit and more verbose than in R-tidyverse.<\/span><\/p>\n<p><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The following code examples compare the same data transformation steps in R-tidyverse and Python-pandas.<\/p>\n<p><\/p>\n<p><!-- \/wp:syntaxhighlighter\/code --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><!-- \/wp:paragraph --><\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-47a0e71 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"47a0e71\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column 
elementor-col-100 elementor-top-column elementor-element elementor-element-63ce0a5\" data-id=\"63ce0a5\" data-element_type=\"column\" data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f4e2a7c ob-has-background-overlay elementor-widget elementor-widget-image\" data-id=\"f4e2a7c\" data-element_type=\"widget\" data-settings=\"{&quot;_ob_photomorph_use&quot;:&quot;no&quot;,&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.11.5 - 14-03-2023 *\/\n.elementor-widget-image{text-align:center}.elementor-widget-image a{display:inline-block}.elementor-widget-image a img[src$=\".svg\"]{width:48px}.elementor-widget-image img{vertical-align:middle;display:inline-block}<\/style>\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"1536\" height=\"914\" src=\"https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-1536x914.png\" class=\"attachment-1536x1536 size-1536x1536 wp-image-2116\" alt=\"\" loading=\"lazy\" srcset=\"https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-1536x914.png 1536w, https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-300x179.png 300w, https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-1024x610.png 1024w, https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-768x457.png 768w, https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-2048x1219.png 2048w, https:\/\/ml4data.com\/wp-content\/uploads\/2023\/01\/r_vs_python-600x357.png 600w\" sizes=\"(max-width: 1536px) 100vw, 1536px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-6020974 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"6020974\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-cbcbdde\" data-id=\"cbcbdde\" data-element_type=\"column\" 
data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5227cdf ob-harakiri-inherit ob-has-background-overlay elementor-widget elementor-widget-text-editor\" data-id=\"5227cdf\" data-element_type=\"widget\" data-settings=\"{&quot;_ob_use_harakiri&quot;:&quot;yes&quot;,&quot;_ob_harakiri_writing_mode&quot;:&quot;inherit&quot;,&quot;_ob_postman_use&quot;:&quot;no&quot;,&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>\u00a0<\/p>\n<p>Example in R-tidyverse:<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-c3e47a2 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c3e47a2\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-2ec5f95\" data-id=\"2ec5f95\" data-element_type=\"column\" 
data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-7b390f4 ob-has-background-overlay elementor-widget elementor-widget-code-highlight\" data-id=\"7b390f4\" data-element_type=\"widget\" data-settings=\"{&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<div class=\"prismjs-tomorrow copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-r line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-r\">\n\t\t\t\t\t<xmp>library(tidyverse)\n\n# Copy the built-in iris dataset\niris_clean <- as.data.frame(iris) %>% \n\n  #Remove duplicate rows\n  distinct() %>% \n\n  #Rename columns\n  rename(sepal_length = Sepal.Length,\n         sepal_width = Sepal.Width,\n         petal_length = Petal.Length,\n         petal_width = Petal.Width,\n         species = Species\n         ) %>% \n\n  #Create new column\n  mutate(sepal_ratio = sepal_length\/sepal_width) %>% \n\n  #Group by species and summarize columns\n  group_by(species) %>% \n  summarize(mean_sepal_length = mean(sepal_length),\n            sd_sepal_length = sd(sepal_length),\n            mean_sepal_width = mean(sepal_width),\n            sd_sepal_width = sd(sepal_width),\n            mean_petal_length = mean(petal_length),\n            sd_petal_length = sd(petal_length),\n            mean_petal_width = mean(petal_width),\n            sd_petal_width = sd(petal_width),\n            mean_sepal_ratio = mean(sepal_ratio),\n          
  sd_sepal_ratio = sd(sepal_ratio)) %>% \n\n  #Sort data\n  arrange(species)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-4f758ef elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4f758ef\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-78a6b98\" data-id=\"78a6b98\" data-element_type=\"column\" data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f1cab73 ob-harakiri-inherit ob-has-background-overlay elementor-widget elementor-widget-text-editor\" data-id=\"f1cab73\" data-element_type=\"widget\" data-settings=\"{&quot;_ob_use_harakiri&quot;:&quot;yes&quot;,&quot;_ob_harakiri_writing_mode&quot;:&quot;inherit&quot;,&quot;_ob_postman_use&quot;:&quot;no&quot;,&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>Example in 
Python-pandas:<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"ob-is-breaking-bad elementor-section elementor-top-section elementor-element elementor-element-b1d6a49 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b1d6a49\" data-element_type=\"section\" data-settings=\"{&quot;_ob_bbad_use_it&quot;:&quot;yes&quot;,&quot;_ob_bbad_sssic_use&quot;:&quot;no&quot;,&quot;_ob_glider_is_slider&quot;:&quot;no&quot;}\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9df87a0\" data-id=\"9df87a0\" data-element_type=\"column\" data-settings=\"{&quot;_ob_bbad_is_stalker&quot;:&quot;no&quot;,&quot;_ob_teleporter_use&quot;:false,&quot;_ob_column_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_column_has_pseudo&quot;:&quot;no&quot;}\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-05240b7 ob-has-background-overlay elementor-widget elementor-widget-code-highlight\" data-id=\"05240b7\" data-element_type=\"widget\" data-settings=\"{&quot;_ob_perspektive_use&quot;:&quot;no&quot;,&quot;_ob_poopart_use&quot;:&quot;yes&quot;,&quot;_ob_shadough_use&quot;:&quot;no&quot;,&quot;_ob_allow_hoveranimator&quot;:&quot;no&quot;,&quot;_ob_widget_stalker_use&quot;:&quot;no&quot;}\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<div class=\"prismjs-tomorrow copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>from sklearn.datasets import load_iris\nimport pandas as pd\n\n# Load the iris dataset into a DataFrame\niris_data = pd.DataFrame(data=load_iris().data, 
columns=load_iris().feature_names)\niris_data[\"species\"] = pd.Series(load_iris().target).replace(\n    [0, 1, 2], [\"setosa\", \"versicolor\", \"virginica\"]\n)\n\niris_data_clean = (\n    # Remove duplicate rows\n    iris_data.drop_duplicates()\n\n    # Rename columns\n    .rename(\n        columns={\n            \"sepal length (cm)\": \"sepal_length\",\n            \"sepal width (cm)\": \"sepal_width\",\n            \"petal length (cm)\": \"petal_length\",\n            \"petal width (cm)\": \"petal_width\",\n        }\n    )\n\n    # Create new column\n    .assign(sepal_ratio=lambda x: x[\"sepal_length\"] \/ x[\"sepal_width\"])\n    \n    # Group by species and summarize columns\n    .groupby(\"species\").agg(\n        {\n            \"sepal_length\": [\"mean\", \"std\"],\n            \"sepal_width\": [\"mean\", \"std\"],\n            \"petal_length\": [\"mean\", \"std\"],\n            \"petal_width\": [\"mean\", \"std\"],\n            \"sepal_ratio\": [\"mean\", \"std\"],\n        }\n    )\n    # Sort data by species, which is the index after grouping\n    .sort_index()\n)\n<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Data transformation, the process of converting raw data into a format more suitable for analysis, is a crucial step in data analysis. Two popular tools for data transformation are R-tidyverse and Python-pandas. 
Both of these tools are widely used by data scientists and analysts, but they have different [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":2115,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"two_page_speed":[]},"categories":[1],"tags":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/posts\/2111"}],"collection":[{"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/comments?post=2111"}],"version-history":[{"count":18,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/posts\/2111\/revisions"}],"predecessor-version":[{"id":2131,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/posts\/2111\/revisions\/2131"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/media\/2115"}],"wp:attachment":[{"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/media?parent=2111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/categories?post=2111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ml4data.com\/index.php\/wp-json\/wp\/v2\/tags?post=2111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}