Saturday, December 31, 2016

Mac, Excel and Jupyter Hotkeys






Excel

Paste cmd+v
Copy cmd+c
Cut cmd+x
Paste special cmd+ctrl+v

Clear delete
Undo cmd+z
Redo cmd+y

New blank workbook cmd+n
Print cmd+p
Save cmd+s
Close window cmd+w
Quit Excel cmd+q

Underline cmd+u
Italic cmd+i
Bold cmd+b

Select all cmd+a
Add or remove a filter ctrl+shift+l

Fill down cmd+shift+down, ctrl+d
Fill right cmd+shift+right, ctrl+r

Screen right fn+option+arrow down
Screen left fn+option+arrow up
Move to last cell fn+ctrl+arrow right
Move to first cell fn+ctrl+arrow left

Display the Go To dialog box ctrl+g
Display the Format Cells dialog box cmd+1
Display the Replace dialog box ctrl+h
Display the Save As dialog box cmd+shift+s
Display the Open dialog box cmd+o

Jupyter







Tuesday, December 27, 2016

Hello TensorFlow

In November 2015, Google open-sourced TensorFlow, its numerical computation library using data flow graphs. Its flexible implementation and architecture let you focus on building the computation graph and deploy the model with little effort to heterogeneous platforms such as mobile devices, hundreds of machines, or thousands of computational devices.


#########################################################
## Installations of TensorFlow
#########################################################
Anaconda is a Python distribution that includes a large number of standard numeric and scientific computing packages. Anaconda uses a package manager called 'conda' that has its own environment system similar to Virtualenv.

- Install Anaconda

- Create a conda environment

conda create -n tensorflow python=3.6

conda install -c conda-forge tensorflow

conda install ipython

conda install jupyter

which python

which ipython

which jupyter


- Activate the conda environment and install TensorFlow in it.

source activate tensorflow

- After the install, you will activate the conda environment each time you want to use TensorFlow.

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.0rc0-cp27-none-linux_x86_64.whl

- Optionally install IPython and other packages into the conda environment.

source activate tensorflow

source deactivate

Install Python


### Python commands
Quickly install Python command-line tools
``` 
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```

Pipenv automatically creates and manages a virtualenv for your project, and adds/removes packages in the Pipfile as you install/uninstall them. It also generates the all-important Pipfile.lock file, used to produce deterministic builds.
``` 
pipx install pipenv
```

Black is a code formatter; it produces minimal diffs, which speeds up code review.
isort sorts imports alphabetically and automatically separates them into sections.
```
pipenv install black isort --dev
```
setup.cfg config
```
[isort]
multi_line_output=3
include_trailing_comma=True
force_grid_wrap=0
use_parentheses=True
line_length=88
```
use black and isort
```
pipenv run black
pipenv run isort
```

Generate a project with cookiecutter
```
pipx run cookiecutter gh:sourceryai/python-best-practices-cookiecutter
```



#############################
# Test the TensorFlow installation
#############################
python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42

###################################
# Run TensorFlow from the Command Line
###################################
>>> import os
>>> import inspect
>>> import tensorflow
>>> print(os.path.dirname(inspect.getfile(tensorflow)))
/Users/tkmaemd/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow

(tensorflow) NY-C02MW0YGFD58:~ tkmaemd$ python -c 'import os; import inspect; import tensorflow; print(os.path.dirname(inspect.getfile(tensorflow)))'
/Users/tkmaemd/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow

###################################
# Basic Usage
###################################
TensorFlow programs are usually structured into a construction phase, that assembles a graph, and an execution phase that uses a session to execute ops in the graph.

For example, it is common to create a graph to represent and train a neural network in the construction phase, and then repeatedly execute a set of training ops in the graph in the execution phase.

# Building the graph

import tensorflow as tf
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.],[2.]])
product = tf.matmul(matrix1, matrix2)

# Launch the default graph

sess = tf.Session()
result = sess.run(product)
print(result)
sess.close()

# Interactive Usage

import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])
x.initializer.run()
sub = tf.sub(x, a)
print(sub.eval())
# ==> [-2. -1.]
sess.close()


# Variables

state = tf.Variable(0, name="counter")
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(state))

    for _ in range(3):
        sess.run(update)
        print(sess.run(state))

# Fetches

input1 = tf.constant([3.0])
input2 = tf.constant([2.0])
input3 = tf.constant([5.0])
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)

with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)

# Feeds

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.mul(input1, input2)
with tf.Session() as sess:
    print(sess.run([output], feed_dict={input1: [7.], input2: [2.]}))

###################################
# Hello World
###################################
import tensorflow as tf
h = tf.constant("Hello")
w = tf.constant(" World!")
hw = h + w

with tf.Session() as sess:
    ans = sess.run(hw)

print(ans)

###################################
# Run a TensorFlow demo model
###################################
cd /Users/tkmaemd/anaconda/envs/tensorflow/lib/python2.7/site-packages/tensorflow/models/image/mnist


###################################
# Introduction
###################################
source activate py35
source activate tensorflow
ipython
source deactivate

import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables.  We will 'run' this first.
init = tf.global_variables_initializer()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]

Friday, December 23, 2016

Executive Briefings

Up-to-the-minute news, executive briefings, global retail and VM trends and trend confirmations are all covered here to help hone your strategies for retail, customer communications and business.
Uncover up-and-coming trends and in-depth reports on marketing strategies and experience design.
An annual calendar helps you plan for the most important industry events.

Kroger CEO, Rodney McMullen.

Win over not only the high end, but also the low end.

The economy continues to slowly improve, and customers continue to feel more optimistic, but the bifurcation in the economy remains. Some consumers are willing to spend more while others are worried about their job or next paycheck, or are more focused on saving. We find all customers want quality products and a great shopping experience. For the customer who is more focused on natural and organic products, we have our own Simple Truth products. We also have many entry-level price point items of excellent quality. For customers looking for incredibly high quality products, like Boar's Head or Murray's Cheese, just to name a couple, we have that, too.

Our job is to understand and deliver for our diverse set of customers so they can save where they want to save and splurge where they want to splurge.

Kohl's CEO, Kevin Mansell

Even more importantly, traffic in our stores was extremely strong. Our stores enjoyed a solid increase in sales for the three days combined. On an annual basis, about 80% of our business is done in stores, so it was exciting to see so many customers enjoying the experience of shopping together with family and friends this weekend.

Solid activity in store, combined with increased online business, is how we will succeed as an omni-channel retailer. At the core of this omnichannel transformation is the need to evolve the way our Merchandise organization works: combining our separate E-commerce and Store Merchandise and Planning organizations into one unified omnichannel team. We are seeing our online and in-store experience work very well together, as nearly half of our online orders were fulfilled either through Buy Online Pick Up in Store or Ship from Store. We are driving online shoppers to our stores, and our stores are making a better online experience by cutting shipping time down and increasing available inventory.

From an Incredible Savings standpoint, many of our competitors can offer great savings on items at any point. What makes us stand out is our loyalty efforts that are ongoing — from K's marketing program to the incredible value of the K's app and mobile wallet. At no time was that more evident than this past weekend. I believe that rewarding our customers through loyalty is how we will drive positive customer behaviors and differentiate ourselves for the rest of this season and for the future.

Airbnb CEO, Brian Chesky

Put Customers First. Here is an elegantly simple but powerful viewpoint. If you want to create a great product, just focus on one person. Make that person have the most amazing experience ever.

Berkshire-Hathaway CEO, Warren Buffett

Act with Integrity. It takes 20 years to build a reputation and five minutes to ruin it. If you think about that, you'll do things differently.

Facebook CEO, Mark Zuckerberg

Build Great Teams. How does Facebook stay relevant in a space where things change in a nanosecond? If you’re in an environment where you’re not learning as much as you think you should be, if you don’t have the people around you who you think are going to inspire you to do the best work that you can, then think about changing something. Because that’s a big deal.

Pepsi CEO, Indra Nooyi

Drive Results. You’ve got to look at the investments you make in the company as a portfolio. There’s a bunch of stuff that delivers in the short term. That gives you the breathing room and the fodder to invest in the long term.

If you're not prepared to be wrong, you'll never come up with anything original.
Sir Ken Robinson, TED 2006 (#1 TED talk)

WeChat Case

In September 2016, WeChat launched geo-location-based in-stream Moments advertising to offer more segmented targeting. The app has also launched a banner ad format that lets advertisers pick their preferred official accounts. Rates are based on page views.

Popular topics include astrology, humour, self-improvement, health and wellness, cooking, fitness, travel, news, entertainment, and parenthood.






Thursday, December 8, 2016

Open Data Sources

DataHub (http://datahub.io/dataset)

World Health Organization (http://www.who.int/research/en/)

Data.gov (http://data.gov)

European Union Open Data Portal (http://open-data.europa.eu/en/data/)

Amazon Web Service public datasets (http://aws.amazon.com/datasets)

Healthdata.gov (http://www.healthdata.gov)

Machine Learning Repository (http://archive.ics.uci.edu/ml/)


Other Popular open data repositories:


UC Irvine Machine Learning Repository

Kaggle datasets

Amazon’s AWS datasets

Meta portals (they list open data repositories):

http://dataportals.org/

http://opendatamonitor.eu/

http://quandl.com/

Other pages listing many popular open data repositories:

Wikipedia’s list of Machine Learning datasets (https://goo.gl/SJHN2k)


Quora.com question (http://goo.gl/zDR78y)

Visual 13 - Create Pie Chart

dat1 <- dbGetQuery(conn,"select channel, sum(a.sld_qty)
                  from eipdb_sandbox.ling_sls_brnd_demog a
                  where a.gma_nbr in (2,5)
                  and a.trn_sls_dte between '2016-11-14' AND '2016-12-16'
                  group by 1
                  order by 1;")
dat1$perc <- dat1$sld_qty/sum(dat1$sld_qty)
p1 <- ggplot(dat1, aes(x = factor(1), y =sld_qty, fill = channel)) +
  geom_bar(width = 1, stat = "identity") +
  scale_fill_manual(values = c("red", "blue")) +
  coord_polar(theta="y", start = pi / 3) +
  ##labs(title = "Kohl's Sold Items by Channel") +
  geom_text_repel(aes(label=scales::percent(perc)), size=4.5) + ylab("") + xlab("") +
  theme_void()


dat2 <-dbGetQuery(conn,"select new_ind, sum(a.sld_qty)
                  from eipdb_sandbox.ling_sls_brnd_demog a
                  where a.gma_nbr in (2,5)
                  and a.trn_sls_dte between '2016-11-14' AND '2016-12-16'
                  group by 1
                  order by 1;")
dat2$new_ind <- factor(dat2$new_ind)
dat2$perc <- dat2$sld_qty/sum(dat2$sld_qty)
p2 <- ggplot(dat2, aes(x = "", y =sld_qty, fill = new_ind)) +
  geom_bar(width = 1, stat = "identity") +
  scale_fill_manual(values = c("darkgreen", "orangered", "red")) +
  coord_polar("y", start = pi / 3) +
  ##labs(title = "Kohl's Sold Items by New/Existed Customer") +
  geom_text_repel(aes(label=scales::percent(perc)), size=4.5) + ylab("") + xlab("") +
  theme_void()


dat3 <-dbGetQuery(conn,"select sku_stat_desc, sum(a.sld_qty)
                  from eipdb_sandbox.ling_sls_brnd_demog a
                 where a.gma_nbr in (2,5)
                 and a.trn_sls_dte between '2016-11-14' AND '2016-12-16'
                 group by 1
                 order by 1;")
dat3$perc <- dat3$sld_qty/sum(dat3$sld_qty)
dat3 <- dat3[order(dat3$sld_qty),]
p3 <- ggplot(dat3, aes(x = "", y =sld_qty, fill = sku_stat_desc)) +
  geom_bar(width = 1, stat = "identity") +
  scale_fill_brewer(palette = "Spectral") +
  coord_polar("y", start = pi / 3) +
  ##labs(title = "Kohl's Sold Items by SKU Status") +
  geom_text_repel(aes(label=scales::percent(perc)), size=4.5) + ylab("") + xlab("") +
  theme_void()


grid.arrange(p1,p2,p3, nrow=3, ncol=1)

Wednesday, December 7, 2016

Visual12 - Create Maps


http://bcb.dfci.harvard.edu/~aedin/courses/R/CDC/maps.html

#########################################################
## Geographic Information of Customers
#########################################################
# Returns centroids
getLabelPoint <- function(county) {
  Polygon(county[c('long', 'lat')])@labpt}
df <- map_data("state")              
centroids <- by(df, df$region, getLabelPoint)     # Returns list
centroids <- do.call("rbind.data.frame", centroids)  # Convert to Data Frame
names(centroids) <- c('long', 'lat')                 # Appropriate Header
centroids$states <- rownames(centroids)

dat8<-dbGetQuery(conn,"select demand_state,
                count(distinct mstr_persona_key) cust
                 from eipdb_sandbox.ling_sls_brnd_demog
                 where new_ind=1 and trn_sls_dte between '2014-11-01' and '2016-10-31'
                 group by 1
                 order by 1;
                 ")
## Join with States
dat8$states <- tolower(state.name[match(dat8$demand_state,  state.abb)])

states <- map_data("state")
head(states)
dat9 <- merge(dat8, centroids, by="states")
dat9$statelabel <- paste(dat9$demand_state, "\n", format(dat9$cust, big.mark = ",", scientific = F),  sep="")

# ggplot(data = Total) +
#   geom_polygon(aes(x = long, y = lat, fill = region, group = group), color = "white") +
#   coord_fixed(1.3) +
#   guides(fill=FALSE) +
#   geom_text(data=statelable, aes(x=long, y=lat, label = demand_state), size=2)

ggplot() +
  geom_map(data=states, map=states,
           aes(x=long, y=lat, map_id=region),
           fill="#ffffff", color="#ffffff", size=0.15) +
  geom_map(data=dat8, map=states,
           aes(fill=cust, map_id=states),
           color="#ffffff", size=0.15) +
  coord_fixed(1.3) +
  scale_fill_continuous(low = "thistle2", high = "darkred", guide="colorbar") +
  #scale_fill_distiller(name="Customers", palette = "YlGn", breaks=pretty_breaks(n=5)) +
  #geom_text(data=dat9, hjust=0.5, vjust=-0.5, aes(x=long, y=lat, label=statelabel), colour="black", size=4 ) +
  geom_text(data=dat9, aes(x=long, y=lat, label=statelabel), colour="black", size=4 ) +
  ggtitle("Customers from 11/1/2014 to 10/31/2016") + ylab("") + xlab("") +
  theme(plot.title = element_text(face = "bold", size = 20)) +
  theme(axis.text.x = element_text(face = "bold", size = 14)) +
  theme(axis.text.y = element_text(face = "bold", size = 14)) +
  theme(axis.title.x = element_text(face = "bold", size = 16)) +
  theme(strip.text.x = element_text(face = "bold", size = 16)) +
  theme(axis.title.y = element_text(face = "bold", size = 16, angle=90)) +
  guides(fill=FALSE)

## Plot2
## American Community Survey (ACS) Data
## Join with States
## access population estimates for US States in 2012
?df_pop_state
data(df_pop_state)
head(df_pop_state)
dat10 <- merge(dat9, df_pop_state, by.x="states", by.y="region")
dat10$perc <- dat10$cust/dat10$value
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}
dat10$statelabel <- paste(dat10$demand_state, "\n", percent(dat10$perc,2,"f"),  sep="")
head(dat10)

p9 <- ggplot() +
  geom_map(data=states, map=states,
           aes(x=long, y=lat, map_id=region),
           fill="#ffffff", color="#ffffff", size=0.15) +
  geom_map(data=dat10, map=states,
           aes(fill=perc, map_id=states),
           color="#ffffff", size=0.15) +
  coord_fixed(1.3) +
  scale_fill_continuous(low = "thistle2", high = "darkred", guide="colorbar") +
  geom_text(data=dat10, aes(x=long, y=lat, label=statelabel), colour="black", size=4 ) +
  ggtitle("Customers from 11/1/2015 to 10/31/2016") + ylab("") + xlab("") +
  theme(plot.title = element_text(face = "bold", size = 20, hjust = 0.5)) +
  theme(axis.text.x = element_text(face = "bold", size = 14)) +
  theme(axis.text.y = element_text(face = "bold", size = 14)) +
  theme(axis.title.x = element_text(face = "bold", size = 16)) +
  theme(strip.text.x = element_text(face = "bold", size = 16)) +
  theme(axis.title.y = element_text(face = "bold", size = 16, angle=90)) +
  guides(fill=FALSE)


#########################################################
## World Maps
#########################################################
## regi_geo
data <- ddply(user_regi, .(regi_geo), summarise, tot=length(user_id))
summary(data$tot)
tmp=joinCountryData2Map(data, joinCode = "ISO2"
                    , nameJoinColumn = "regi_geo"
                    , verbose='TRUE'
)
tmp$tot[is.na(tmp$tot)]=0
# catMethod='categorical'
mapCountryData(tmp, nameColumnToPlot="tot",catMethod="fixedWidth")

#getting class intervals
classInt <- classIntervals(tmp[["tot"]], n=5, style = "jenks")
catMethod = classInt[["brks"]]
#getting colours
colourPalette <- brewer.pal(5,'RdPu')
#plot map
mapParams <- mapCountryData(tmp
                            ,nameColumnToPlot="tot"
                            ,addLegend=FALSE
                            ,catMethod = catMethod
                            ,colourPalette=colourPalette )
#adding legend
do.call(addMapLegend
        ,c(mapParams
           ,legendLabels="all"
           ,legendWidth=0.5
           ,legendIntervals="data"
           ,legendMar = 2))

tmp2=data.frame(tmp[['regi_geo']], tmp[['tot']], tmp[['NAME']])
tmp2=tmp2[order(tmp2$tmp...tot...,decreasing = T),]
write.csv(tmp2, "junk.csv")

Monday, November 28, 2016

Image Recognition in Python

Classification using Deep Learning

The training phase for an image classification problem has 2 main steps:
  1. Feature Extraction: In this phase, we utilize domain knowledge to extract new features that will be used by the machine learning algorithm. HoG and SIFT are examples of features used in image classification.
  2. Model Training: In this phase, we utilize a clean dataset composed of the images' features and the corresponding labels to train the machine learning model.
In the prediction phase, we apply the same feature extraction process to the new images and we pass the features to the trained machine learning algorithm to predict the label.
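
To make those two steps concrete, here is a minimal sketch using scikit-image's HoG extractor and a scikit-learn classifier; the random images and labels are placeholders, not a real dataset.
```
# Minimal sketch of the two training steps (assumes scikit-image and scikit-learn).
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

images = np.random.rand(100, 64, 64)    # placeholder grayscale images
labels = np.random.randint(0, 2, 100)   # placeholder labels

def extract(img):
    # Step 1 - Feature Extraction: HoG descriptor for one image
    return hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

features = np.array([extract(img) for img in images])

# Step 2 - Model Training
clf = LinearSVC().fit(features, labels)

# Prediction phase: same feature extraction, then predict the label
print(clf.predict([extract(np.random.rand(64, 64))]))
```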
The main difference between traditional machine learning and deep learning algorithms is in the feature engineering. In traditional machine learning algorithms, we need to hand-craft the features. By contrast, in deep learning algorithms feature engineering is done automatically by the algorithm. Feature engineering is difficult, time-consuming and requires domain expertise. The promise of deep learning is more accurate machine learning algorithms compared to traditional machine learning with less or no feature engineering.
Artificial neurons are inspired by biological neurons and try to formulate the model explained above in a computational form. An artificial neuron has a finite number of inputs with weights associated with them, and an activation function (also called a transfer function). The output of the neuron is the result of the activation function applied to the weighted sum of inputs. Artificial neurons are connected with each other to form artificial neural networks.
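
As a minimal sketch of that definition, the neuron below applies a sigmoid activation to the weighted sum of its inputs; the weights and bias are arbitrary.
```
# One artificial neuron: activation(weighted sum of inputs + bias)
import numpy as np

def neuron(x, w, b):
    z = np.dot(w, x) + b             # weighted sum of inputs
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid activation

print(neuron(np.array([1.0, 2.0]), np.array([0.5, -0.3]), 0.1))  # 0.5
```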

Feedforward Neural Networks

Feedforward Neural Networks are the simplest form of Artificial Neural Networks.
These networks have 3 types of layers: Input layer, hidden layer and output layer. In these networks, data moves from the input layer through the hidden nodes (if any) and to the output nodes.
Below is an example of a fully-connected feedforward neural network with 2 hidden layers. "Fully-connected" means that each node is connected to all the nodes in the next layer.
Note that the number of hidden layers and their size are the only free parameters. The larger and deeper the hidden layers, the more complex patterns we can model in theory.
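
A quick sketch of the data flow through such a network; the layer sizes and random weights are arbitrary, chosen only to show the fully-connected structure.
```
# Forward pass through a fully-connected network with 2 hidden layers
import numpy as np

rng = np.random.RandomState(0)
W1, b1 = rng.randn(4, 3), np.zeros(4)  # input (3) -> hidden layer 1 (4)
W2, b2 = rng.randn(5, 4), np.zeros(5)  # hidden 1 (4) -> hidden layer 2 (5)
W3, b3 = rng.randn(2, 5), np.zeros(2)  # hidden 2 (5) -> output (2)

relu = lambda z: np.maximum(0, z)
x = np.array([1.0, 0.5, -1.0])         # input layer
h1 = relu(W1 @ x + b1)                 # data moves through the hidden nodes...
h2 = relu(W2 @ h1 + b2)
print(W3 @ h2 + b3)                    # ...and out the output nodes
```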

Activation Functions

Activation functions transform the weighted sum of inputs that goes into the artificial neurons. These functions should be non-linear to encode complex patterns of the data. The most popular activation functions are Sigmoid, Tanh and ReLU. ReLU is the most popular activation function in deep neural networks.
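
For reference, here are the three activations applied to the same inputs (a minimal numpy sketch):
```
import numpy as np

z = np.array([-2.0, 0.0, 2.0])
print(1 / (1 + np.exp(-z)))  # Sigmoid: squashes into (0, 1)
print(np.tanh(z))            # Tanh: squashes into (-1, 1)
print(np.maximum(0, z))      # ReLU: zero for negatives, identity otherwise
```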

Training Artificial Neural Networks


The goal of the training phase is to learn the network's weights. We need 2 elements to train an artificial neural network:
  • Training data: In the case of image classification, the training data is composed of images and the corresponding labels.
  • Loss function: A function that measures the inaccuracy of predictions.
Once we have the 2 elements above, we train the ANN using an algorithm called backpropagation together with gradient descent (or one of its derivatives). 
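
As a hand-written miniature of that training loop, here is gradient descent on a mean squared error loss for the same line-fitting problem as the TensorFlow example earlier in this post, with the gradients derived by hand instead of by an optimizer; it is a sketch, not a full backpropagation implementation.
```
# Gradient descent on MSE for y = w*x + b (gradients computed manually)
import numpy as np

x = np.random.rand(100).astype(np.float32)
y = x * 0.1 + 0.3

w, b, lr = 0.0, 0.0, 0.5
for step in range(200):
    err = (w * x + b) - y           # prediction error
    w -= lr * 2 * np.mean(err * x)  # dLoss/dw
    b -= lr * 2 * np.mean(err)      # dLoss/db

print(w, b)  # approaches W: 0.1, b: 0.3
```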

Convolutional Neural Networks

Convolutional neural networks are a special type of feed-forward networks. These models are designed to emulate the behaviour of a visual cortex. CNNs perform very well on visual recognition tasks. CNNs have special layers called convolutional layers and pooling layers that allow the network to encode certain images properties.

Convolution Layer

This layer consists of a set of learnable filters that we slide over the image spatially, computing dot products between the entries of the filter and the input image. The filters should extend to the full depth of the input image. For example, if we want to apply a filter of size 5x5 to a colored image of size 32x32, then the filter should have depth 3 (5x5x3) to cover all 3 color channels (Red, Green, Blue) of the image. These filters will activate when they see some specific structure in the images.
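
Here is a naive single-channel sketch of that sliding dot product (no padding, stride 1); a real filter on a color image would also span the depth dimension as described above.
```
# Slide a filter over an image, taking a dot product at each position
import numpy as np

def conv2d(image, filt):
    fh, fw = filt.shape
    H, W = image.shape
    out = np.zeros((H - fh + 1, W - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + fh, j:j + fw] * filt)
    return out

print(conv2d(np.random.rand(32, 32), np.random.rand(5, 5)).shape)  # (28, 28)
```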

Pooling Layer

Pooling is a form of non-linear down-sampling. The goal of the pooling layer is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network, and hence to also control overfitting. There are several functions to implement pooling among which max pooling is the most common one. Pooling is often applied with filters of size 2x2 applied with a stride of 2 at every depth slice. A pooling layer of size 2x2 with stride of 2 shrinks the input image to a 1/4 of its original size.
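
A minimal sketch of 2x2 max pooling with stride 2, showing the 1/4 size reduction:
```
# Max pooling: keep the maximum of each 2x2 patch (stride 2)
import numpy as np

def max_pool(x, size=2, stride=2):
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool(x))  # [[ 5.  7.] [13. 15.]] -- 1/4 of the original size
```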

Convolutional Neural Networks Architecture

The simplest architecture of a convolutional neural networks starts with an input layer (images) followed by a sequence of convolutional layers and pooling layers, and ends with fully-connected layers. The convolutional layers are usually followed by one layer of ReLU activation functions.

The convolutional, pooling and ReLU layers act as learnable feature extractors, while the fully-connected layers act as a machine learning classifier. Furthermore, the early layers of the network encode generic patterns of the images, while later layers encode the detailed patterns of the images.

Note that only the convolutional layers and fully-connected layers have weights. These weights are learned in the training phase.

Caffe Overview
Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC). It is written in C++ and has Python and Matlab bindings.

There are 4 steps in training a CNN using Caffe:
  • Step 1 - Data preparation: In this step, we clean the images and store them in a format that can be used by Caffe. We will write a Python script that will handle both image pre-processing and storage.
  • Step 2 - Model definition: In this step, we choose a CNN architecture and we define its parameters in a configuration file with extension .prototxt.
  • Step 3 - Solver definition: The solver is responsible for model optimization. We define the solver parameters in a configuration file with extension .prototxt.
  • Step 4 - Model training: We train the model by executing one Caffe command from the terminal. After training the model, we will get the trained model in a file with extension .caffemodel.
After the training phase, we will use the .caffemodel trained model to make predictions on new unseen data. We will write a Python script to do this.

Friday, November 25, 2016

Basic Github

Typical development workflow
- developer establishes a local environment
- developer initializes the git local environment
- developer clones the common repo to create a local version
- on the local repo, the developer creates a branch to work within
- developer creates new code files or changes existing files while syncing with the remote trunk
- developer commits any new code and changes within the branch container
- developer pushes the new branch to the project's remote repo
- developer performs a pull request
- new code is determined to either need additional work or is deemed acceptable
- if accepted, new code is merged into the remote trunk

1 Create the remote repository, and get the URL such as
git@github.com:/youruser/somename.git or https://github.com/youruser/somename.git
If your local git repo is already set up, skip steps 2 and 3
2 Locally, at the root directory of your source,
git init
3 Locally, add and commit what you want in your initial repo (for everything,
git add . then
git commit -m 'initial commit comment')
git clone git@github.com:githubteacher/**********.git
To attach your remote repo with the name 'origin' (like cloning would do):
git remote add origin git@github.kohls.com:tkmaemd/XXX.git
Execute
git pull origin master to pull the remote branch so that they are in sync.
To push up your master branch (change master to something else for a different branch):
git remote -v
git push origin master

git pull --rebase origin master
git push -u origin master


git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch users.csv'


### Git workflow

Untracked, unstaged, staged

### Basic workflow
git status
git add .
git status
git commit -m 'update models'
git status
git pull origin master
git push origin master

ls -lah
git branch dev
git checkout dev
git add .
git status
git commit -a -m '***'
git checkout master
git pull origin master
git merge dev
git push origin master

### More advanced workflow

mkdir log_update
cd log_update
git init
git clone ……git_test.repo.git
cd git_test.repo.git
git branch
git branch log_update
git branch
git checkout log_update
git status
git add .
git commit -m 'create new features'

git checkout master
git pull
git pull
git checkout log_update
git merge master -m 'merging log_update'
git push
# click ‘compare and pull request’ to ask for reviews
# click ‘merge pull request’


Github tricks

#1 Edit code directly on GitHub.com
#2 Paste images
#3 Prettify code with syntax highlighting - https://github.com/github/linguist/blob/fc1404985abb95d5bc33a0eba518724f1c3c252e/vendor/README.md
```jsx
```
#4 Close issues neatly from PRs - https://help.github.com/articles/closing-issues-using-keywords/
#5 Link to a comment - click the timestamp next to the username on a comment to get its link
#6 Link to code - open a file, then click a line number to the left of the code, or hold shift to select multiple lines
#7 Use the GitHub address bar flexibly - to jump to a branch and see how it differs from master, append /compare/branch-name to your repo URL; to compare two branches, enter /compare/integration-branch...my-branch
#8 Create checkbox lists
#9 Manage projects inside GitHub - https://help.github.com/articles/searching-issues-and-pull-requests/
#10 GitHub wiki - https://github.com/davidgilbertson/about-github/wiki
#11 Static blogs - https://github.com/davidgilbertson/about-github
#12 Use GitHub as a CMS (content management system) -
https://www.npmjs.com/package/marked
https://chrome.google.com/webstore/detail/octotree/bkhaagjahfmjljalopjnoealnfndnagc?hl=en-US

% Day 1
 % 1 Configuration
git --version

git config --global user.email "user email"
git config --global user.email "***@kohls.com"

git config --global user.name "user name"
git config --global user.name "***"
git config --local user.name "user name"
git config --local user.name "***"

git config --local user.email "user email"
git config --local user.email "***@kohls.com"

mkdir 'google trend api'
touch index.html
touch index.css
touch about-us.html
touch about-us.css

Pull Requests

% 2 Initializing a Repository in an Existing Directory
-- Initialize the local directory as a Git repository.
git init

-- Add the files in your new local repository. This stages them for the first commit.
git add ind*
-- git commit moves files from the staging area into the local repo's history. The commit command creates a new version/snapshot of the project in the repo.

git status
git add .
git status
git commit -m "add about page with css"

git log
git show 9065
git config --global alias.lg "log --oneline --decorate --graph --all -10"

Specify files for git to ignore via configuration:
touch .gitignore
node-modules/

subl index.html
git add .
git status
git commit --m "edit some info to the index page"
subl about-us.html
git commit --am "add some info for about-us"

-- Generating a new SSH key
ssh-keygen -t rsa -b 4096 -C "***@kohls.com"

-- Adding your SSH key to the ssh-agent
eval "$(ssh-agent -s)"

-- Adding your SSH key to the ssh-agent
ssh-add ~/.ssh/id_rsa

-- Adding a new SSH key to your GitHub account
pbcopy < ~/.ssh/id_rsa.pub

-- At the top of your GitHub repository's Quick Setup page, click to copy the remote repository URL.

-- Check remote origin
git remote -v

git push origin master
-- In Terminal, add the URL for the remote repository where your local repository will be pushed.

git remote add origin (remote repository url)

git remote add origin git@github.kohls.com:tkmaemd/Google-Trend-API.git

clear
git log
git config --global alias.lg "log --decorate --grahp --all -10"

% 3 Change file names
git mv index.html home.htm
git status
git commit --m "rename index.html to home.htm"

change index.css --> home.css
git status
git add .
git status
git add -A

% 4 Branching (Add & Delete)
Branch is a version of a project removed from master
Merge is a code from a branch 'merged' with code in the master version

git branch cart
touch cart.htm
touch cart.css
git commit -am "add info in the cart file"
git lg
git status
git add .
git lg
ls
git checkout cart
git branch

git checkout master
git merge cart
git branch
git branch -d cart
git branch -D cart

% 5 Diff
git diff
git diff --staged
git diff --stat
git diff --color-words

% 6 Reset & Conflicts
git status
git reset --hard
git status

git checkout master
git branch
git merge history
git diff --stat

% 7 Clone; Pull & Push
git clone https://github.com/githubteacher/******.git
rm poetry

## generate the public/private ssh key
cd ~/.ssh
ssh-keygen -t rsa -C "**********@gmail.com"
cat ~/.ssh/id_rsa.pub > my.txt

git clone git@github.com:githubteacher/**********.git
git clone git@github.kohls.com:tkmaemd/Google-Trend-API.git
git commit --m "my new addition to the poetry"

git remote add origin git@github.kohls.com:tkmaemd/Trend-Report.git

Push the changes in your local repository to GitHub.
git remote add origin git@github.com:*********/**********.git
git push -u origin master

git pull origin master
or
git branch --set-upstream-to=origin/master master
git pull

git clone git@github.com:linghduoduo/Test.git
%git remote add origin git@github.com:*********/**********.git
git remote set-url origin git@github.com:*********/**********.git
git push -u origin master

% Day 2;
ls
git status
git init yahoo2
cd yahoo2
ls -a
git config --global --edit
touch index.html
touch index.css
touch about_us.html
touch about_us.css
git add in*
git status
git commit -m "create index/home page for website"
git status
git commit -am "added about us page"
git status
git add about_us.html
git status
git reset
git add .
git status
git commit -m "add about us page"
git status

git branch
git branch cart
git branch
git checkout cart
git branch
git branch -d cart
git checkout master
git branch -d cart
git branch
git checkout cart
git checkout -b cart
git branch
touch cart.html
touch cart.css
git add .
git status
git commit -m "First cut of shopping cart"
touch cart.js
git status
git add .
git commit
git status

git config --global alias.lg2 "log --oneline --decorate --all --graph -30"
git lg2
git checkout master
subl index.html
"Here is our phone number 555-555-5555"
git status
git commit -am "Added phone number to  the home page"
git lg2
git checkout cart
subl cart.html
"Finish shoping cart"
git status
git commit -am "Finished up the shopping cart file"
git status
git checkout master
git merge cart
git lg2
git branch
git branch -d cart
git branch

%create branch under branch
git checkout -b contact
git branch
touch contact.html
touch contact.css
git commit -am "add first cut of contact"
touch contact.js
git add .
git commit -m "add java sctript"
git branch
git checkout -b contact_coffee
git mv contact.js contact.coffee.js
git commit -m "rename javascript to java script"
subl contact.coffee.js
git add .
git commit -m "implement coffee version script"
git branch
%git branch -D contact_coffee
git merge contact_coffee
git merge contact_coffee --no-ff
git branch -d contact_coffee
git checkout master

git remote add origin
git push -u origin master
git remote add origin git@github.com:**********/yahoo2.git

git branch
git branch -a
git pull
git checkout master
git branch
git pull
git lg2

git push
git status
git pull
git status

git config --global push.default simple
git push
touch index.html
subl index.html
"and fax is XXXXXXX"
git add .
git commit -m "add fax number"
git pull
git push

%checkout
git status
subl index.html
"Mess up"
git reset --hard
git status
subl index.html
"Mess up"
subl index.css
"Mess up"
git checkout -- index.html
git diff
git commit -am "clean index.css"
git lg2
ls
git checkout 45a6
cat index.html
git checkout master
git checkout

%merge
git branch
git merge contact
git lg2
git branch -d contact
git lg2
git show XXXXX
git merge test --no-ff

git checkout -b test2
subl index.html
git add .
git commit -m "change index"
subl index.html
git add .
git commit -m "change index"
git lg
git branch
git merge test2
%conflict in html
git difftool --tool-help
git help difftool
git status
git mergetool -t opendiff
git mergetool -t vimdiff
git difftool -t vimdiff
git status
subl index.html
%actual merge manually

%Q&A
subl .git/config
git remote -v
subl .git/config

git branch
git branch -d test2
git branch
git branch -a
%remote master: remotes/orgin/master
git pull
git lg
%fetch info from github git remote add <name> <url>
git branch -r
git branch -a

mkdir student
cd student
%copy other people's code
git clone https://github.com/PeterBell/yahoo2.git
git lg
touch myfile.text
git commit -m "my new life"
cd ..
%fork & clone
pwd
cd yahoo
git clone https://github.com/PeterBell/yahoo2.git
cd ../../yahoo3
touch per5143.txt
subl per5143.txt
git add .
git commit -m "perter's commit to the yahoo3 project"
subl .git/config
git pull
git remote add pertermaster http://github.com
git push pertermaster
git pull pertermaster

%stash
cd ./../yahoo2
git status
git push
username
password
clear
git lg
git stash list
subl index.html
"Our address is 1 NY Plaza"
git status
subl index.css
"add new css"
git commit -am "fix css"
git push
username
password
git stash list
git stash pop
git status
git add .
git commit -m "
subl index.html
git stash list
git stash
subl contact.coffee.js
"add new cofffee fiel"
git stash
git stash pop
git stash list
git stash apply
subl index.html
git diff
git stash list
git branch
git checkout -b test3
git status
git stash pop
git stash list
git reset --hard
git status
git stash pop
git commit -am "make a change to the home page"
subl contact.html
"thist is the new phone"
git stash
git stash list
git stash apply
git commit -m "add new phone"
git checkout status
git branch test3
git lg
git stash list
git reset --hard
git status
git commit -am "added field to the contact form"
git checkout
git stash pop
git diff
git commit -m "added new field to contact us form"
git lg
git stash list

%changing history of log
touch user.txt
git add .
git commit -m "list of usersss"
git lg
git commit --amend
%interactive mode, fix list of users
ls
touch store.html
subl store.html
"store locator is link"
git status
git commit -am "added new store locator and a link page"
git status
git add .
git status
git commit --amend
%interactive mode, fix the error
git status
git lg
git show XXXXX
git lg
ls
touch store.css
touch store.js
git add .
git commit -m "added styling and js to store locator"
git lg
git diff
git reset --soft HEAD~1
git lg
git status
git commit -m "added store locator and link"
%git reset --hard HEAD~1
git status
git commit -m "undid the list"
git reset --hard
git revert
git reflog
git checkout master
git branch
git branch -D bad_code
git show XXXXXXX
git checkout -b good_code XXXXXXXXXX
git reflog
subl index.html
git status
git reset --hard
git lg

Git Basics

# Edit file
 vi joke.txt
 git diff
 git commit -a
 git status
# Add new file
 vi new.txt
 git add new.txt
 git commit
 git status
# Remove file
 git rm new.txt
 git commit
 git status
# Move file
 git mv old.txt new.txt
 git commit
 git status

Daily workflow

# Get latest and greatest code from origin
 git checkout master
 git pull
# Create a new workspace
 git checkout -b bug1234
# Fix bug 1234 and commit changes
 vi bugfix.txt
 git commit -a
# Back to master to sync with origin
 git checkout master
 git pull
# Back to workspace to fold in latest code
# Rebase upstream changes into my downstream branch
 git checkout bug1234
 git rebase master
# Validate my change against latest stable code
 run unittest.txt
# Ready to send downstream changes to master
# Merge my workspace and master so they have identical commits
 git checkout master
 git merge bug1234
# Push my downstream changes up to origin
 git push
# Delete my workspace
 git branch -d bug1234

# Unstage changes
git reset [file]
# Undoes all changes
git reset --hard [commit]
# Revert a single file
git checkout -- [file]

# Revert to a commit
git revert -n [commit]
# Diff options
git diff [commit] [commit]
git diff master:file branch:file
git diff HEAD^ HEAD
git diff master..branch
git diff --cached
git diff --summary
git diff --name-only
git diff --name-status
git diff -w # ignore all whitespace
git diff --relative[=path] (run from subdir or set path)
# Log|Shortlog options
# --author=jenny, --pretty=oneline, --abbrev-commit,
# --no-merges, --stat, --since, --topo-order|--date-order
git log -- <filename> # history of a file, deleted too
git log dir/ # commits that modify any file under dir/
git log test..master # commits on master but not test
git log master..test # commits on test but not master
git log master...test # commits on either test or master
# but not both
git log -S'foo()' # commits that add or remove any file data
# matching the string 'foo()'
git show :/fix # last commit w/"fix" in msg
# Compare master vs branch
git diff master..branch
git diff master..branch | grep "^diff" # changed files only
git shortlog master..branch
git show-branch
git whatchanged master..mybranch
git cherry -v <upstream> [<head>] # commits not merged upstream
git config core.autocrlf input
git config core.safecrlf true
git config --global push.default tracking # only push current
# Sync branch to master
git checkout master
git pull
# Clean up previous commits before sending upstream
git rebase -i HEAD~n
git rebase -i master mybranch
# Pull requests/tracking branches
[git remote add -f foobar git://github...] # set up remote
git branch --track newbranch foobar/whichbranch
# Push to remote branch
git push [remote] HEAD:[remote-branch]
git push origin HEAD
git push origin :branch (delete remote branch)
# Stashing
git stash list
git stash show -p stash@{2}
git stash [pop|apply] stash@{2}
git stash drop stash@{2}
# Merge upstream changes with WIP
git stash save "Log msg."
git [pull|rebase master]
git stash apply
# Merge files from another branch into master
git checkout master
git checkout feature path/to/file path/to/another/file
# Copy commit from another branch
git cherry-pick -x [commit] # -x appends orig commit message
# Branching
git branch [-a | -r]
git checkout -b newbranch
git branch -d oldbranch
git branch -m oldbranch newbranch
# Interrupt WIP with quick fix
git stash save "Log msg."
vi file;
git commit -a
git stash pop
# Test incremental changes to a single file
git add --patch [file]
git stash save --keep-index "Log msg."
[test patch]
git commit
git stash pop
...repeat...

Tuesday, November 22, 2016

IPython and Using Notebooks

IPython is an open source platform for interactive and parallel computing. It started with the realization that the standard Python interpreter was too limited for sustained interactive use, especially in the areas of scientific and parallel computing.

For OS X users, it is usually recommended to use a package manager such as MacPorts or Homebrew, install Python therein, and avoid using the system Python. A better solution for novices is to install an independent Python distribution, such as Anaconda or Enthought.

Installing Anaconda

Anaconda is a free distribution of Python packages distributed by Continuum Analytics. Conda can be used for package management as well as environment management.

bash Anaconda3-4.3.0-MacOSX-x86_64.sh

Installing Homebrew

Homebrew is "the missing package manager" for OS X.

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

To check for any issues with the install run

brew doctor

To search for an application:

brew search

To install an application:

brew install <application-name>

To list all apps installed by Homebrew

brew list

To remove an installed application

brew remove <application-name>

To update Homebrew

brew update

To see what else you can do

man brew

/usr/local/Library/LinkedKegs contains a list of linked kegs, so this should do the trick:

ls -1 /usr/local/Library/LinkedKegs | while read line; do 
echo $line 
brew unlink $line 
brew link --force $line 
done

Installing Python

conda create -n py36 python=3.6 anaconda
source activate py36

Installing IPython

conda install ipython

IPython comes with a test suite called iptest.

iptest

Updating Python

All-in-one distributions -

When pip and easy-install are not enough, both Anaconda and Canopy have their own built-in package management systems.

Anaconda provides a powerful command-line tool called conda. conda can be used for package management as well as environment management. Every program runs in an environment that includes the version of Python, IPython, and all included packages.

conda update conda

conda update python

Note that recent versions of IPython require Python 3.

Install Python Packages

pip list
pip install --upgrade <package>
pip install -r requirements.txt

To activate the environment

cd /Users/tkmaemd/anaconda/envs/py35/bin
source activate py35
ipython

To deactivate environment

cd /Users/tkmaemd/anaconda/envs/py35/bin
source deactivate py35

Shell integration
ipython
In[1]
Out[2]
?map
??map

Magic commands

OS equivalents: %cd, %env, and %pwd
Working with code: %run, %edit, %save, %load, %load_ext, and %%capture
Logging: %logstart, %logstop, %logon, %logoff, and %logstate
Debugging: %debug, %pdb, %run, and %tb
Documentation: %pdef, %pdoc, %pfile, %pprint, %psource, %pycat, and %%writefile
Profiling: %prun, %time, %run, and %timeit
Working with other languages: %%script, %%html, %%javascript, %%latex, %%perl, and %%ruby

Installing R in Jupyter

1 installing via supplied binary packages

install.packages(c('repr', 'IRdisplay', 'evaluate', 'crayon', 'pbdZMQ', 'devtools', 'uuid', 'digest'))
devtools::install_github('IRkernel/IRkernel')

2 Making the kernel available to Jupyter

IRkernel::installspec()

3 install basic R packages by conda

conda install -c r r-essentials

Extra for magic commands

With magic commands, IPython becomes a more full-featured development environment. A development session might include the following steps:

  1. Set up the OS-level environment with the %cd, %env, and ! commands.
  2. Set up the Python environment with %load and %load_ext.
  3. Create a program using %edit.
  4. Run the program using %run.
  5. Log the input/output with %logstart, %logstop, %logon, and %logoff.
  6. Debug with %pdb.
  7. Create documentation with %pdoc and %pdef.
This is not a tenable workflow for a large project, but for exploratory coding of smaller modules, magic commands provide a lightweight support structure.

Some observations are in order:
  • Note that the function is, for the most part, standard Python. Also note the use of the !systeminfo shell command. You can freely mix both standard Python and IPython in IPython.
  • The name of the function will be the name of the line magic.
  • The line parameter contains the rest of the line (in case any parameters are passed).
  • A parameter is required, although it need not be used.
  • The Out associated with calling this line magic is the return value of the magic.
  • Any print statements executed as part of the magic are displayed on the terminal but are not part of Out (or _).
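
Putting those observations together, here is a minimal custom line magic (run inside an IPython session); the magic's name and behavior are made up for illustration.
```
from IPython.core.magic import register_line_magic

@register_line_magic
def shout(line):
    """The rest of the line arrives in the `line` parameter."""
    print("this goes to the terminal, not Out")
    return line.upper()   # the return value becomes Out (and _)

# In [2]: %shout hello
# this goes to the terminal, not Out
# Out[2]: 'HELLO'
```
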
Debug example

x=0
1/x

%debug
h
(help)
w
(where am i)
p x
(print)
q
(drop debugger)
%pdb


A full complement of commands is available for navigation:

u/d for moving up/down in the call stack.
s to step into the next statement. This will step into any functions.
n to continue execution until the next line in the current function is reached or it returns. This will execute any functions along the way, without stopping to debug them.
r continues execution until the current function returns.
c continues execution until the next breakpoint (or exception).
j <line> jumps to line number <line> and executes it. Any lines between the current line and <line> are skipped over. The j works both forward and reverse.

And handling breakpoints:

b for setting a breakpoint. The b <line> will set a breakpoint at line number <line>. Each breakpoint is assigned a unique reference number that other breakpoint commands use.
tbreak. This is like break, but the breakpoint is temporary and is cleared after the first time it is encountered.
cl <bpNumber> clears a breakpoint, by reference number.
ignore <bpNumber> <count> is for ignoring a particular breakpoint for a certain number (<count>) of times.
disable <bpNumber> for disabling a breakpoint. Unlike clearing, the breakpoint remains and can be re-enabled.
enable <bpNumber> re-enables a breakpoint.

Examining values:

a to view the arguments to the current function
whatis <arg> prints the type of <arg>
p <expression> prints the value of <expression>
Mastering IPython 4.0

Chapter 1 Using Python for HPC

High Performance Computing

An API allowed people to store data on those machines (the Amazon Simple Storage Service, or S3), and another API allowed people to run programs on the same machines (the Amazon Elastic Compute Cloud, or EC2). Together, these made up the start of the Amazon Cloud.

Fortran provided answers to problems of readability, portability, and efficiency within the computing environments that existed in early machines. Python/IPython, while not originally designed for runtime efficiency, takes these new considerations into account.

Chapter 2 Advanced Shell Topics

IPython beyond Python

There are too many magic commands to go over in detail, but there are some related families to be aware of:
OS equivalents: !ls, %cd, %env, and %pwd
Working with code: %run, %edit, %save, %load, %load_ext, and %%capture
Logging: %logstart, %logstop, %logon, %logoff, and %logstate
Debugging: %debug, %pdb, %run, and %tb
Documentation: %pdef, %pdoc, %pfile, %pprint, %psource, %pycat, and %%writefile
Profiling: %prun, %time, %run, and %timeit
Working with other languages: %%script, %%html, %%javascript, %%latex, %%perl, and %%ruby

Terminal Python
stdin&stdout
Python execution
JSON
IPython Kernel 

Chapter 3  Stepping Up to IPython for Parallel Computing

Serial Processes

Program counters and address spaces
Batch systems
Multitasking (Cooperative multitasking / Preemptive multitasking) and preemption

Threading

Threading in Python
Limitations of threading
Global Interpreter Lock

Using multiple processors

The IPython parallel architecture

Getting started with ipyparallel

Parallel magic commands

Types of parallelism

Data Parallelism

Application steering


Sunday, November 20, 2016

WGSN Insight

WGSN Product Breakdown - 23 Categories

Automotive
Bed & Bath
Colour
Consumer Electronics
Decorative Accessories
Experience Design
Fashion Connection
Food & Drink
Furniture & Lighting
Garden & Outdoor
Hospitality
Interior Style
Kids’ Room
Kitchen & Tabletop
Materials & Surfaces
Paper & Packaging
Pets
Print & Pattern
Seasonal Gifting
Textiles
Vintage & Craft
Walls & Floors
Wellness

Insight - Transformative consumer and market intelligence
- In-depth insight into the consumer of today and tomorrow.
- Complete coverage of trends in retail, consumer markets and marketing.
- Global team of top industry experts and on-the-ground trend hunters.
- Original content with fresh perspectives to spark outside-the-box thinking.

Fashion - The world's #1 fashion trend forecaster.
- Enhance your planning with color and trend forecasts 2+ years ahead.
- Get inspired by more than 22m images and thousands of royalty free CADs and designs.
- Drive sales by staying on-trend with over 250 new reports each month.
- Save half a day every week with our productivity tools and city guides.

Lifestyle & Interiors - trend service for the consumer lifestyle and interiors industry.
- Plan ahead with color and trend reports, with specific edits for interiors.
- Develop inspired design with in-depth content in 23 sections, from automotive to wellness
- Drive revenue by staying on-trend with over 50 new, in-depth market reports each month.
- Save time with our trade show summaries, so you don't have to be there.


Instock - The high data analytics platform for critical retail decisions.
- Make faster buying and merchandising decisions with access to a daily feed of e-commerce data.
- Understand your market and product position with analyses of more than 12,000 brands and retailers.
- Make smarter trading decisions with regular stock drop reports and more than 100m retail SKUs monitored and analysed.
- Improve range planning by analyzing competitor data by color, price and product mix.

Styletrial - Rapid consumer feedback to improve buying, merchandising and pricing.
- Reduce investment risk by testing new product and packaging ideas before you go to market.
- Improve certainty of buying and merchandising decisions by accessing millions of US and UK consumers.
- Ensure alignment of price and target audience to your product offering.
- Make more rapid decisions by receiving actionable feedback with results within five days.

Mindset - Tailored trend consulting by world-class experts.
- Improve your strategy by accessing our dedicated team of market and consumer insight specialists.
- Hone your brand proposition based on a tailor-made interpretation of current fashion and lifestyle trends.
- Improve the performance of your products and your team with our innovation workshops.
- Enhance your retail or trade show offer with our tailor-made trend zones and retail edits.

WGSN Future Key Takeaways

An experiential and innovative environment

Experimentation is key for retailers to be successful in the future, and needs to be built into business models. Over the course of a short period, the business was able to experiment with a number of emerging technologies, and was able to show in a live situation what worked and what didn’t.

Jeun Ho Tsang, the co-founder of London-based experimental store laboratory The Dandy Lab, explained 83% of retailers are failing to innovate. Jeun Ho Tsang said RFID loyalty cards proved successful as it meant the business knew what its customers had seen previously and their colour preferences. This enabled it to create a better relationship with them. Mobile payments also had a significant uptake, with 42% of shoppers signing up. Customers initially use the app to pay, and then continued to do so on repeat visits to scan and find out more about items.

The Role of the Store

Shumacher said that by 2020, 39% of purchases will be influenced by omnichannel, and the move to an experience economy means changing how we view the store. Many stores won’t sell product in 20 years’ time, but they will remain one of the most important components of the brand experience. Where today, physical retail’s success is down to sales, success metrics in the store of the future will be things like customer experience per square metre, active participation, social interaction, and how well retailers have staged the product. Retailers need to understand what a good brand relationship is, how what that involves is changing, and how consumer expectations are rising. Brands that are doing well are those with a “really strong sense of purpose” which aligns with that of the customer base. “A brand purpose is useful, and a short cut to an emotional relationship,” said Betmead. “What matters is that you care about something that they care about.”

Instagram Stories: Brand Narratives
Instagram gives users across the globe an insider's look into celebrity, fashion, luxury and more.
A new brand created a story to announce a new product offering. The first clip was designed to draw viewers in, provoking curiosity and enticing them to keep watching. Next, it featured images of the product (with clever use of emoji). This type of announcement allows audiences to feel as if they were let in on a secret of sorts, likely increasing the audience's receptiveness to the brand.

Consumer Attitudes

Chinese Millennials

As the importance of lifestyle continues to grow in China, active and semi-fine jewelry have emerged as key retail categories to watch. Currently considered either an affordable fast-fashion accessory or a serious financial investment, contemporary-level jewelry is still a relatively new concept for the Chinese consumer.

The Lonely Generation
Social, community-focused shopping experiences have become a retail priority for brands operating in China as a way to attract consumers. In line with the global importance of hybrid lifestyle stores, women-only concepts and in-store coffee shops are on the rise as retailers serve to fulfill the human need to connect.
On social media channels such as Instagram and Weibo, photo captions and hashtags feature phrases such as #lonely, #lonelytodeath, #lonelyphotographer and #lonelygourmand.

Self-Obsessed
Following an operating room selfie scandal and a story about changing room exhibitionists, the generation is now beginning to challenge its own digital narcissism as it becomes more self-aware and strives for personal improvement.

Economy as Culture
Driven by the need to provide for past and future generations, personal finance has become a form of pop culture among Millennials. Popular hobbies include investing in stocks and venturing into entrepreneurship.

Tech Entrepreneurs
Tech has emerged as a key industry to watch for sustaining the country's future financial growth. Expect to see this group grow among Generation Z as favorable entrepreneurial policies are set to roll out in higher education institutions in the near future.

Urban Inspiration
Urban centers in China represent a dream of a better quality of life for Millennial consumers. As iconic symbols of innovation and opportunity, cities serve as an important source of inspiration for art and contemporary films.

Women-only Socials
Gender-specific group chats focusing on personal and professional support have also been growing to fulfill the need for a safe, growth-oriented community. This generation's interest in feminism is on the rise.

Digital Experience
The most popular topics on the platform include WeChat e-commerce, health, red envelopes, travel, humor and mobile phone costs.

Fashion is a multi-billion dollar industry with social and economic implications worldwide. The fashion industry has traditionally placed high value on human creativity and has been slower to realize the potential of data analytics. With the advent of modern cognitive computing technologies (data mining and knowledge discovery, machine learning, deep learning, computer vision, natural language understanding, etc.) and vast amounts of structured and unstructured fashion data, the impact on the fashion industry could be transformational. Already fashion e-commerce portals are using data to brand themselves not just as online warehouses but as fashion destinations. Luxury fashion houses are planning to recreate the physical in-store experience for their virtual channels, and a slew of technology startups are providing trending, forecasting, and styling services to the fashion industry.


Cold Start Analysis

Increase the duration of the moving window.

Develop hierarchy of keyword groups and calculate PTQS

Infer PTQS from partner attributes

The hierarchical structure can address the cold-start problem and provide smoothing to some extent.
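A minimal sketch of what such hierarchical smoothing might look like, assuming PTQS is a conversion-style rate estimated per keyword group; the function name, the prior_clicks strength, and the sample numbers are all hypothetical:

```
# Hypothetical sketch: shrink a keyword group's PTQS toward its parent
# in the hierarchy, weighted by how much data the group has.
def hierarchical_ptqs(conversions, clicks, parent_ptqs, prior_clicks=100.0):
    weight = clicks / (clicks + prior_clicks)
    own_rate = conversions / clicks if clicks > 0 else 0.0
    return weight * own_rate + (1.0 - weight) * parent_ptqs

# A cold-start group (10 clicks) stays close to the parent estimate.
print(hierarchical_ptqs(conversions=1, clicks=10, parent_ptqs=0.03))        # ~0.036
# A data-rich group (10,000 clicks) is dominated by its own rate.
print(hierarchical_ptqs(conversions=500, clicks=10000, parent_ptqs=0.03))   # ~0.0498
```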

Isotonic regression in scikit-learn

http://tullo.ch/articles/speeding-up-isotonic-regression/

Isotonic regression is a useful non-parametric regression technique for fitting an increasing function to a given dataset.

A classic use is in improving the calibration of a probabilistic classifier. Say we have a set of 0/1 data-points (e.g. ad clicks), and we train a probabilistic classifier on this dataset.

Unfortunately, we find that our classifier is poorly calibrated - for cases where it predicts about 50% probability of a click, there is actually a 20% probability of a click, and so on.

With a trained isotonic regression model, our final output is the composition of the classifier's prediction with the isotonic regression function.
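A minimal sketch of this calibration pattern with scikit-learn's IsotonicRegression; the synthetic data and the logistic-regression base classifier are stand-ins, not the setup from the papers cited below:

```
from sklearn.datasets import make_classification
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for click data: y is a 0/1 label (e.g. click / no click).
X, y = make_classification(n_samples=5000, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Base probabilistic classifier, possibly poorly calibrated.
clf = LogisticRegression().fit(X_train, y_train)

# Fit an increasing map from raw scores to observed outcome rates on held-out data.
iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(clf.predict_proba(X_cal)[:, 1], y_cal)

# Final output: the isotonic function composed with the classifier's prediction.
p_calibrated = iso.predict(clf.predict_proba(X_test)[:, 1])
```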

For an example of this usage, see the Google Ad Click Prediction - A View from the Trenches paper from KDD 2013, which covers this technique in section 7. The AdPredictor ICML paper also uses this technique for calibrating a Naive Bayes predictor.

We'll now detail how we made the scikit-learn implementation of isotonic regression roughly 5,000x faster, while reducing the number of lines of code in the implementation.

The nature of a conversion event can vary widely across advertisers. Conversion events can be defined by submission of a completed form, a purchase event, subscribing to a service, etc. Each of these has a different intrinsic conversion rate.

A partner generates traffic from several websites, which may vary widely in traffic quality. Source tags may be a more natural granularity; however, source tags are susceptible to manipulation.

Classified and structured match

Product match



Domain match

Optimal Frequency

1 Introduction

The first transaction after running an EM campaign really counts.

Determine the optimal frequency and impose a sensible cap, which enables us to decrease cost per sale.

This study focuses on optimal frequency from a direct-response standpoint, namely, how to increase the efficiency of a campaign at delivering leads and sales. After analyzing campaign data, we are able to look into the impact of frequency on redemptions and sales.

2 The wrong path to optimal frequency

Consider a consumer who redeemed after the third mail but subsequently receives more mails. Attributing the redemption to the total number of mails sent would grossly overestimate the frequency level at which the consumer redeemed.

We are able to identify the most common frequency level prior to redemption. However, since this approach ignores the vast majority of mails (those sent to consumers who never redeem), realistic redemption rates are impossible to estimate this way.

3 The right path to optimal frequency

Cumulative redemption rates reveal the true optimal frequency level. By looking at cumulative mails and redemptions, a model is created of how redemptions are harvested with each incremental mail. In effect, this methodology simulates what would have happened had the campaign been frequency-capped at different levels.
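As a rough illustration, the cap simulation can be computed directly from per-consumer logs; the toy DataFrame and column names below are hypothetical:

```
import pandas as pd

# Hypothetical log: one row per consumer, with total mails received and,
# if they redeemed, the mail count at which the redemption occurred.
log = pd.DataFrame({
    "mails_received": [1, 3, 5, 10, 2, 7, 4],
    "redeemed_at":    [1, None, 2, None, 2, 3, None],
})

for cap in range(1, int(log["mails_received"].max()) + 1):
    # Under a cap, a redemption still happens only if it occurred at or below the cap.
    redemptions = (log["redeemed_at"] <= cap).sum()
    mails_sent = log["mails_received"].clip(upper=cap).sum()
    print(f"cap={cap}: redemption rate per mail = {redemptions / mails_sent:.3f}")
```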

The redemption rate on the first email was the highest, though the first three all had at least 100% lift on average. At any given moment only a fraction of consumers will immediately respond to your solicitation. Thus, a direct marketing campaign’s performance will depend on its ability to maximize reach at the optimal frequency level and boost the frequency of consumers who have only seen a few mails.

4 The most efficient frequency vs. the most profitable frequency

The frequency level with the highest response rate may not necessarily be the frequency level that maximizes your profits. There will always be a trade-off advertisers have to manage between efficiency and volume. Restricting frequency to only one email per consumer might achieve the lowest possible cost per response, but you may end up with a very low total number of responses.
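To make the trade-off concrete, here is a toy sketch (all numbers hypothetical) of choosing the cap that maximizes total responses while still meeting a cost-per-response goal:

```
# Hypothetical (cap, total_mails, total_responses) from a cap simulation
# like the one in the previous section.
results = [(1, 10_000, 120), (3, 26_000, 270), (5, 38_000, 280), (10, 55_000, 300)]

cost_per_mail = 0.02
cpr_goal = 2.00  # maximum acceptable cost per response

feasible = [(cap, resp) for cap, mails, resp in results
            if cost_per_mail * mails / resp <= cpr_goal]
best_cap, best_responses = max(feasible, key=lambda t: t[1])
print(best_cap, best_responses)  # cap 3 beats cap 1 on volume within the goal
```

In this toy example a cap of 1 gives the lowest cost per response, but a cap of 3 yields more total responses while still meeting the goal.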

5 What this means for marketers

It is important for advertisers to recognize and react to the amount of money being wasted on excessively high-frequency users. The culprits are not the consumers who receive four, five, or six mails, but rather the thousands of consumers who receive hundreds of mails without any response. We suggest basic frequency caps. Imagine what a frequency cap could mean when a consumer receives 1,000 emails: a cap at 10 emails would free enough impressions to reach roughly 100 additional potential customers.

Knowing how various caps will impact campaign performance, monitoring where gross waste is significant, and understanding whether negotiated caps are truly in effect will all help in planning and buying media more intelligently.

Quantify the trade-offs between frequency levels and response rates.

Quantify how much pricing premiums for capped inventory are actually worth.

Strategically pick frequency levels that maximize total response yields, while still meeting the cost per response goals.

Identify whether customer purchase frequency increased as a result of a specific marketing campaign.

Identify email campaigns that assist direct mail campaigns in driving in-store purchases.

Typical visit frequency

Consider the case of a user who deletes cookies every day.

If a particular site has a group of very addicted users who return frequently, even if a small number of them delete cookies daily, the result will be a significantly inflated number of cookies relative to the number of actual people who visited the site. In such cases it is not unusual to see an average of two or more cookies for every user over the course of a month.

Of course, most users only result in one cookie, but a small number generate many cookies.
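A small simulation illustrates how a minority of daily cookie-deleters inflates cookie counts; the population mix and deletion rate below are purely hypothetical:

```
import numpy as np

rng = np.random.default_rng(0)
n_users = 100_000

# Hypothetical mix: 5% of users are "addicted" and visit 30 times a month;
# everyone else visits once. 20% of the addicted users delete cookies daily.
visits = np.where(rng.random(n_users) < 0.05, 30, 1)
deletes_daily = (visits == 30) & (rng.random(n_users) < 0.2)

# Normally one cookie per user; daily deleters generate one cookie per visit.
cookies = np.where(deletes_daily, visits, 1).sum()
print(cookies / n_users)  # cookies-per-user ratio, roughly 1.3 here
```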

Controlled experiments by economists and social scientists have examined the effect of a user's past exposure. While the results differ across scenarios, largely they show that with increased past exposure the user is more likely to respond positively. Intuitively, past exposure to an ad might help in several ways, such as increased brand awareness and familiarity with the product, or even an increased probability that the user notices the ad. At the same time, some studies have also shown an “ad fatigue” effect, where users may tire of an ad if it is displayed too often.

Some sites tend to have mostly passers-by, i.e., visitors who only go to the site once over a given time period. For such sites, cookie deletion has little impact on the total number of cookies.

There also exist addicted users who visit particular sites frequently; when these users delete cookies, the cookie counts become significantly inflated relative to the number of actual people who visited the sites.

The frequency distribution of user visits on RON (run-of-network) inventory is skewed: a small portion of users are frequent visitors, while the rest are infrequent visitors. Hence, the sample size available to estimate per-user item affinity is small for a large number of users.
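One common remedy is to shrink each user's affinity estimate toward the global rate; here is a minimal empirical-Bayes sketch (the function and its parameters are hypothetical, not a reference implementation):

```
# Smooth a per-user rate with a Beta prior centered on the global rate.
def smoothed_affinity(clicks, impressions, global_rate, strength=20.0):
    alpha = strength * global_rate
    beta = strength * (1.0 - global_rate)
    return (clicks + alpha) / (impressions + alpha + beta)

# An infrequent visitor (1 click in 2 impressions) is pulled toward the prior.
print(smoothed_affinity(1, 2, global_rate=0.05))      # ~0.09, not 0.5
# A frequent visitor (80 clicks in 1000 impressions) keeps roughly its own rate.
print(smoothed_affinity(80, 1000, global_rate=0.05))  # ~0.079
```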