I am trying to use use ggplot to plot production data by company and use the color of the point to designate year. The follwoing chart shows a example based on sample data:
However, often times my real data has 50-60 different comapnies wich makes the Company names on the Y axis to be tiglhtly grouped and not very asteticly pleaseing.
What is th easiest way to show data for only the top 5 companies information (ranked by 2011 quanties) and then show the rest aggregated and shown as "Other"?
Below is some sample data and the code I have used to create the sample chart:
# create some sample data
c=c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")
q=c(1,2,3,4,5,6,7,8,9,10)
y=c(2010)
df1=data.frame(Company=c, Quantity=q, Year=y)
q=c(3,4,7,8,5,14,7,13,2,1)
y=c(2011)
df2=data.frame(Company=c, Quantity=q, Year=y)
df=rbind(df1, df2)
# create plot
p=ggplot(data=df,aes(Quantity,Company))+
geom_point(aes(color=factor(Year)),size=4)
p
I started down the path of a brute force approach but thought there is probably a simple and elegent way to do this that I should learn. Any assistance would be greatly appreciated.
No comments:
Post a Comment