# Building a pricing model for diamonds

Plots covered: a matrix of plots, the distribution of price, the distribution of carat, the correlation between transformed price and carat, and the relationship of price with color, clarity, and cut.

We can see that there is no clear relationship between price and depth. Across different values of depth, price varies over the same range. That means depth tells us little about price, and therefore we do not include depth in our model.

The same holds for table: it shows no relationship with price. The price varies over the same range across different table values.
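One way to back this visual impression with numbers is to compute a correlation coefficient and to compare the typical price across bins of the predictor. The sketch below is only an illustration on synthetic data, not the diamonds data itself: the `depth` and `price` values are randomly generated so that they are independent by construction, and the helper names `pearson` and `median_by_bin` are hypothetical.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def median_by_bin(xs, ys, n_bins=5):
    """Sort by x, split into n_bins groups of near-equal size, return the median y per group."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) / n_bins
    medians = []
    for i in range(n_bins):
        chunk = pairs[int(i * size): int((i + 1) * size)]
        medians.append(statistics.median([y for _, y in chunk]))
    return medians

rng = random.Random(0)
# Synthetic stand-ins (assumption): depth is roughly normal, price is log-normal,
# and the two are generated independently of each other.
depth = [rng.gauss(61.7, 1.4) for _ in range(5000)]
price = [math.exp(rng.gauss(7.8, 1.0)) for _ in range(5000)]

r = pearson(depth, price)
medians = median_by_bin(depth, price)
print(f"correlation(depth, price) = {r:.3f}")
print("median price per depth bin:", [round(m) for m in medians])
```

If the correlation is close to zero and the per-bin medians stay roughly flat, that supports leaving the variable out. Running the same two helpers on the real `depth`, `table`, and `price` columns would be the actual check.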

For the variables color, cut, and clarity, we cannot see a clear relationship with price, but there does seem to be some order in these relationships. Therefore, we will investigate them in our model later.

I made a histogram of price and saw that the distribution is skewed to the right, with a long tail toward high prices: there are many more cheap diamonds than very expensive ones. I looked for a transformation of price that would bring it closer to a normal distribution, and found that taking the logarithm gives a distribution that is approximately normal.

I did the same with carat. Its distribution is also skewed to the right: people buy many more small diamonds, and there are relatively few very big diamonds larger than 2 carats. I again took the logarithm to get a distribution that is closer to normal.
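The effect of the log transform on skewness can also be measured rather than only read off a histogram. The following is a minimal sketch under stated assumptions: the `price` values are synthetic log-normal draws standing in for the real column, and `skewness` is a hypothetical helper that computes the standard moment-based sample skewness. A positive value indicates a long right tail, and a value near zero indicates a roughly symmetric distribution.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(42)
# Synthetic stand-in for price (assumption): log-normal draws,
# so there are many low values and a few very high ones.
price = [math.exp(rng.gauss(7.8, 1.0)) for _ in range(10000)]
log_price = [math.log(p) for p in price]

skew_raw = skewness(price)
skew_log = skewness(log_price)
print(f"skewness of price:      {skew_raw:.2f}")
print(f"skewness of log(price): {skew_log:.2f}")
```

The same function applies unchanged to carat and its logarithm. On real data the log-transformed values will not be exactly symmetric, so the aim is a skewness much closer to zero than before, not zero itself.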

