
Concatenate Rows In A Dataframe

I have a dataframe structured like below:

Column A  Column B
1         A
1         B
1         C
1         D
2         B
2         C
2         D
2         E

Solution 1:

In R, we can use dplyr: after grouping by 'ColumnA', paste the contents of 'ColumnB' together and store the result in a new column with mutate.

library(dplyr)
df1 %>%
     group_by(ColumnA) %>% 
     mutate(ColumnC = paste(ColumnB, collapse=""))
# A tibble: 8 x 3
# Groups:   ColumnA [2]
#   ColumnA ColumnB ColumnC
#     <int>   <chr>   <chr>
# 1       1       A    ABCD
# 2       1       B    ABCD
# 3       1       C    ABCD
# 4       1       D    ABCD
# 5       2       B    BCDE
# 6       2       C    BCDE
# 7       2       D    BCDE
# 8       2       E    BCDE

Another option is data.table, which adds the new column by reference:

library(data.table)
setDT(df1)[,  ColumnC := paste(ColumnB, collapse=""), by = ColumnA]

data

df1 <- structure(list(ColumnA = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), ColumnB = c("A", 
 "B", "C", "D", "B", "C", "D", "E")), .Names = c("ColumnA", "ColumnB"
 ), class = "data.frame", row.names = c(NA, -8L))

If we need Python, the same can be done with pandas:

>>> import pandas as pd
>>> df1 = pd.read_clipboard()
>>> df1
#    ColumnA ColumnB
# 1        1       A
# 2        1       B
# 3        1       C
# 4        1       D
# 5        2       B
# 6        2       C
# 7        2       D
# 8        2       E
>>> df1['ColumnC'] = df1.groupby('ColumnA')['ColumnB'].transform(lambda x: ''.join(x))
>>> df1
#    ColumnA ColumnB ColumnC
# 1        1       A    ABCD
# 2        1       B    ABCD
# 3        1       C    ABCD
# 4        1       D    ABCD
# 5        2       B    BCDE
# 6        2       C    BCDE
# 7        2       D    BCDE
# 8        2       E    BCDE
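The session above reads the data from the clipboard, so it cannot be rerun as-is. A minimal sketch that builds the same frame inline, assuming the values from the R `df1` shown earlier, might look like this. The `per_group` name is only for illustration and is not part of the original answer.

```python
import pandas as pd

# Same values as the R `df1`, built inline instead of via read_clipboard().
df1 = pd.DataFrame({
    "ColumnA": [1, 1, 1, 1, 2, 2, 2, 2],
    "ColumnB": ["A", "B", "C", "D", "B", "C", "D", "E"],
})

# transform() broadcasts the joined string back to every row of its group,
# so the result has the same length and index as df1.
df1["ColumnC"] = df1.groupby("ColumnA")["ColumnB"].transform(lambda x: "".join(x))

# agg() instead collapses each group to a single row (hypothetical extra step).
per_group = df1.groupby("ColumnA")["ColumnB"].agg("".join)
```

Use `transform` when the concatenated string should be repeated on every row, as in the question, and `agg` when one row per group is enough.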

Solution 2:

A one-liner in base R, as suggested by @Sotos in the comments. Make sure that ColumnB of df is a character vector and not a factor for this solution.

with(df, ave(ColumnB, ColumnA, FUN = function(i) paste(i, collapse = '')))

Another base R solution:

# Note: each = 4 assumes df is sorted by ColumnA and every group has exactly 4 rows
df$ColumnC <- rep(unlist(by(df, INDICES = df$ColumnA,
  function(t) paste(t$ColumnB, collapse = ""), simplify = FALSE)), each = 4)

> df
#   ColumnA ColumnB ColumnC
# 1       1       a    abcd
# 2       1       b    abcd
# 3       1       c    abcd
# 4       1       d    abcd
# 5       2       b    bcde
# 6       2       c    bcde
# 7       2       d    bcde
# 8       2       e    bcde
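The `rep(..., each=4)` step only lines up when the rows are sorted by ColumnA and every group has exactly four rows. As a rough sketch of a group-wise concatenation that does not rely on either assumption, here is a plain-Python version. The function name `concat_by_group` and the sample data are hypothetical and not from the original answers.

```python
from collections import defaultdict

def concat_by_group(keys, values):
    """Return, for each row, the concatenation of all values sharing that row's key."""
    joined = defaultdict(str)
    for k, v in zip(keys, values):
        joined[k] += v  # values are appended in row order within each key
    # Look the joined string up per row, so unsorted keys and uneven groups work.
    return [joined[k] for k in keys]

# Hypothetical data: unsorted keys, one group of 2 rows and one of 3.
column_a = [2, 1, 2, 1, 1]
column_b = ["x", "a", "y", "b", "c"]
column_c = concat_by_group(column_a, column_b)
```

The dplyr, data.table, `ave` and pandas `transform` solutions above do the same per-row lookup internally, which is why they need no `each` argument.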
