**Question**

**What is the use of VLOOKUP and how do we use it?**

Answer

The function VLOOKUP in Excel is used to look up information in a table and extract the corresponding data.

Syntax: VLOOKUP (value, table, col_index, [range_lookup])

value – Indicates the data that you are looking for in the first column of a table. (This should always be to the left of the column from where you want to retrieve the corresponding value).

table – refers to the set of data (table) from which you have to retrieve the above value.

col_index – The number of the column in the table from which to retrieve the value, counting the left-most column of the table as 1.

range_lookup – [optional] TRUE = approximate match (default); FALSE = exact match.

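For example, assuming a hypothetical table in A2:C10 with names in column A, =VLOOKUP("Bob", A2:C10, 3, FALSE) searches column A for an exact match on "Bob" and returns the value from the third column of the table (column C) in the same row.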

**Question**

**How is VLOOKUP different from the LOOKUP function?**

Answer

| VLOOKUP | LOOKUP |
| --- | --- |
| Looks for a value in the left-most column of a table and returns a value from a column to its right in the same row. | Looks for a value in a single row or column and returns the value from the same position in another row or column. |
| It is not as easy to use as the LOOKUP function. | It is easier to use and can also replace the VLOOKUP function. |

**Question**

**How does the IF() function in Excel work?**

Answer

In Excel, the IF() function performs a logical test. It returns one value if the test evaluates to TRUE and another value if it evaluates to FALSE.

Let’s look at an example. A formula of this kind might read =IF(AND(B2>20, C2>40000), "Record is Valid", "Record is Invalid"), where B2 (age) and C2 (salary) are illustrative cell references.

Here the IF function returns “Record is Valid” if the age is greater than 20 and the salary is greater than $40000. Otherwise, it returns “Record is Invalid”. The final answer is “Record is Valid” only when the selected values satisfy both conditions.

**Question**

**How do you perform a horizontal lookup in Excel?**

Answer

To perform a horizontal lookup, you will have to make use of the HLOOKUP function.

SYNTAX:

*HLOOKUP(lookup_value, table_array, row_index_num, [range_lookup])*

Here,

- lookup_value is the value to be looked up
- table_array is the range from which the data is to be taken
- row_index_num specifies the row from which you want to fetch the value
- range_lookup is a logical value, i.e., TRUE or FALSE (TRUE finds the closest match; FALSE checks for an exact match)

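EXAMPLE: assuming a hypothetical table in A1:E4 with headers in row 1, =HLOOKUP("Sales", A1:E4, 3, FALSE) searches row 1 for an exact match on "Sales" and returns the value from the third row of that column.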

**Question**

**What are the different types of errors you can encounter in Excel?**

Answer

When working with Excel, you can encounter the following six types of errors:

- **#N/A Error**: This is called the ‘Value Not Available’ error. You will see this when you use a lookup formula and it can’t find the value (hence Not Available).
- **#DIV/0! Error**: You’re likely to see this error when a number is divided by 0. This is called the division error.
- **#VALUE! Error**: The value error occurs when you use an incorrect data type in a formula.
- **#REF! Error**: This is called the reference error and you will see this when the reference in the formula is no longer valid. This can happen when the formula refers to a cell that no longer exists (for example, after you delete a row, column, or worksheet that was referred to in the formula).
- **#NAME? Error**: This error is likely to be the result of a misspelled function name.
- **#NUM! Error**: The number error can occur if you try to calculate a value that is too large for Excel. For example, =194^643 will return a number error.

**Question**

**What are the known limitations of the VLOOKUP function?**

Answer

The VLOOKUP function is mighty useful, but it also has a few limitations:

- It cannot be used when the lookup value is on the right. For VLOOKUP to work, the lookup value should always be in the left-most column. This limitation can be overcome by combining VLOOKUP with other functions, but that tends to make formulas complex.
- VLOOKUP gives a wrong result if you insert or delete a column in your data (as the column number now refers to the wrong column). *You can make the column number dynamic, but if you are planning to combine two or more functions, why not use INDEX/MATCH in the first place?*
- When used on large data sets, it can make your workbook slow.

**Question**

**What is the difference between Vlookup and Index Match function?**

Answer

| VLOOKUP | INDEX MATCH |
| --- | --- |
| VLOOKUP can only look up values from left to right. | INDEX MATCH can look up values from left to right as well as from right to left. |
| VLOOKUP can only search vertically, i.e., down a column, and not across rows. | INDEX MATCH can look up values across rows as well as down columns. |
| VLOOKUP has a size limit for lookup_value: it should not exceed 255 characters. | INDEX MATCH has no limit on the size or length of the values being matched. |
| VLOOKUP is easier to understand and remember than INDEX MATCH. | INDEX MATCH is harder to understand and remember because it combines two functions. |

**Question**

**What is Relative cell referencing in Excel?**

Answer

Relative cell referencing is used when dealing with formulas in Excel. If you write a sum formula to add the values of a set of cells (e.g A4 to A8) together, it will look like this: =SUM(A4:A8). If you use relative cell references, then when you copy this formula to a different section of the spreadsheet, the cells will change relative to where the formula has been pasted. For example, if you copy the formula across one column, the formula will become =SUM(B4:B8).

**Question**

**What is Absolute cell referencing?**

Answer

Absolute cell referencing is the exact opposite of relative cell referencing. By marking the row number and column letter with a $ symbol, you can make a cell reference fixed (or “absolute”). This means that when you copy and paste it to another cell or use AutoFill, the cell references will not change. The formula =SUM($A$4:$A$8) will stay as =SUM($A$4:$A$8) no matter where you put it.

**Question**

**What is the difference between COUNT, COUNTA, COUNTBLANK and COUNTIF in Excel?**

Answer

- COUNT: This function counts how many cells within a specified range contain numerical data. It will ignore (not count) any cells that are blank or contain text or symbols only.
- COUNTA: This function counts how many cells within a specified range contain data of any type. It will count all cells that are not blank.
- COUNTBLANK: This function will count the number of blank cells within the designated range.
- COUNTIF: This function will count only the cells whose value meets a certain condition specified by the user.

**Question**

**How would you highlight cells with duplicate values in them?**

Answer

You can do this easily using conditional formatting. Here are the steps:

- Select the data in which you want to highlight duplicate cells.
- Go to the Home tab and click on the Conditional Formatting option.
- Go to Highlight Cells Rules and click on the ‘Duplicate Values’ option.

**Question**

**What is a Pivot Table?**

**Question**

**What is Data Validation?**

Answer

Data Validation restricts the type of values that a user can enter into a particular cell or a range of cells.

In the Data tab, select the ‘Data Validation’ option present under Data Tools.

Select the kind of data validation you want to apply.

**Question**

**How do you create a hyperlink in Excel?**

Answer

To create a link in Excel, select the element you wish to use as the anchor (this can be a cell or an object like a picture). You can then either select Link from the Insert tab, right-click and select Link on the menu, or press Ctrl+K. This will bring up a variety of options that will allow you to indicate what kind of content you would like to link to, such as a file, a web page, a specific location, or an email address.

**Question**

**What is a Solver?**

Answer

Solver in Excel is an add-in that allows you to get an optimum solution when there are many variables and constraints. You can consider it to be an advanced version of Goal Seek.

With Solver, you can specify what the constraints are and the objective that you need to achieve. It does the calculation in the back-end to give you a possible solution.

**Question**

**What are macros in Excel? Create a macro to automate a task.**

Answer

A macro is a program that resides within the Excel file. It is used to automate repetitive tasks that you would like to perform in Excel.

To record a macro, you can either go to the Developer tab and click on Record Macro or access it from the View tab.

**Question**

**What is R programming?**

Answer

R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. It is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:

- an effective data handling and storage facility,
- a suite of operators for calculations on arrays, in particular matrices,
- a large, coherent, integrated collection of intermediate tools for data analysis,
- graphical facilities for data analysis and display either on-screen or on hardcopy, and
- a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

**Question**

**Differentiate between Vector, List, Matrix and Dataframe.**

Answer

A **vector** is a series of data elements of the same basic type. The members of a vector are known as its components.

A **list** is an R object that can contain elements of different types, such as numbers, strings, vectors, or even another list.

A **matrix** is a two-dimensional data structure that can be built by binding vectors of the same length. All elements of a matrix have the same type.

A **data frame** is a more general form of a matrix. It combines features of lists and matrices: it is two-dimensional like a matrix, but different columns can contain different data types.
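The four structures can be compared side by side in a short R session. This is a minimal sketch; the variable names and values are made up for illustration.

```r
# A vector holds elements of a single basic type.
v <- c(4, 5, 8, 14)

# A list can mix types, including vectors and other lists.
l <- list(name = "Ann", scores = v, nested = list(TRUE, "x"))

# A matrix is two-dimensional and holds a single type.
m <- matrix(1:6, nrow = 2, ncol = 3)

# A data frame has equal-length columns that may differ in type.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

str(v)             # numeric vector
str(l)             # list with three differently typed elements
dim(m)             # rows and columns of the matrix
sapply(df, class)  # one class per data frame column
```

Calling str() on any of these objects is a quick way to see which structure you are dealing with.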

**Question**

**Give any 5 features of R.**

Answer

5 features of R are:

- Simple and effective programming language.
- It is data analysis software.
- It provides effective data handling and storage facilities.
- It provides highly extensible graphical techniques.
- It is an interpreted language.

**Question**

**What are the advantages and disadvantages of R?**

Answer

Advantages of R are:

- Open Source
- Data Wrangling
- Array of Packages
- Platform Independent
- Machine Learning Operations

Disadvantages of R are:

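- Weak origin (R inherits design decisions from the much older S language)
- Data handling (R keeps objects in physical memory, which can limit work with very large data sets)
- Basic security (R lacks built-in security features)
- Complicated language (the learning curve is steep for newcomers)
- Lesser speed (R code often runs slower than equivalent code in compiled languages)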

**Question**

**Give the command to create a histogram and to remove a vector from the R workspace.**

Answer

hist() is the command to create a histogram, where you can specify the details by typing hist(v,main,xlab,xlim,ylim,breaks,col,border).

- v is a vector containing numeric values used in histogram.
- main indicates the title of the chart.
- col is used to set the color of the bars.
- border is used to set the border color of each bar.
- xlab is used to give a description of x-axis.
- xlim is used to specify the range of values on the x-axis.
- ylim is used to specify the range of values on the y-axis.
- breaks is used to control how the values are binned, either as a suggested number of bars or as a vector of breakpoints.

rm() is used to remove an object, such as a vector, from the R workspace.
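Both commands can be tried together. The sketch below uses made-up sample values and arbitrary colours and axis limits; it is one possible call, not the only way to set these arguments.

```r
# Hypothetical sample values for the histogram.
v <- c(9, 13, 21, 8, 36, 22, 12, 41, 31, 33, 19)

# Draw the histogram. In a non-interactive session R writes the plot
# to a file device instead of opening a window.
h <- hist(v,
          main = "Sample histogram",  # chart title
          xlab = "Value",             # x-axis description
          xlim = c(0, 50),            # x-axis range
          ylim = c(0, 5),             # y-axis range
          breaks = 5,                 # suggested number of bars
          col = "lightblue",          # fill colour of the bars
          border = "darkblue")        # border colour of the bars

# hist() also returns the bin counts it plotted.
print(h$counts)

# Remove the vector from the workspace and confirm that it is gone.
rm(v)
exists("v", inherits = FALSE)
```

Note that a single number passed as breaks is only a suggestion: R picks "pretty" breakpoints close to that count.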

**Question**

**Why do we use apply() function in R?**

Answer

The apply() function is used to apply the same function over the rows or the columns of a matrix or array. For example, it can find the mean of every row in a matrix.
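As a minimal sketch of row-wise and column-wise use, assuming a small made-up matrix:

```r
# matrix() fills column by column, so the rows are (1, 3, 5) and (2, 4, 6).
m <- matrix(1:6, nrow = 2, ncol = 3)

row_means <- apply(m, 1, mean)  # MARGIN = 1 applies the function to each row
col_sums  <- apply(m, 2, sum)   # MARGIN = 2 applies the function to each column

print(row_means)  # 3 4
print(col_sums)   # 3 7 11
```

The second argument (MARGIN) is what selects rows or columns; the third is the function to apply.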

**Question**

**How do you create a vector in R?**

Answer

To create a vector in R, combine the values with the c() function and use the <- symbol to assign the result to a name. For example, if you want to store the values 4 5 8 14 as a vector in x, you will have to type the command: x<-c(4,5,8,14)
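A short sketch of this command, together with two other common ways of building a vector (the names y and z are illustrative):

```r
x <- c(4, 5, 8, 14)         # combine the values and assign them to x
length(x)                   # number of elements in x
x[2]                        # R indexes from 1, so this is the second element

y <- 1:5                    # the colon operator builds an integer sequence
z <- seq(0, 1, by = 0.25)   # seq() builds a sequence with a chosen step
```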

**Question**

**Explain the different functions that can be applied for Normal distribution in R.**

Answer

The different functions that can be applied for normal distribution in R are as follows:

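dnorm(x, mean, sd) – gives the probability density at x.

pnorm(x, mean, sd) – gives the cumulative probability of a value less than or equal to x.

qnorm(p, mean, sd) – gives the value (quantile) whose cumulative probability is p.

rnorm(n, mean, sd) – generates n random numbers from the distribution.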

The parameters used in the above functions are:

- x is a vector of numbers.
- p is a vector of probabilities.
- n is the number of observations(sample size).
- mean is the mean value of the sample data. Its default value is zero.
- sd is the standard deviation. Its default value is 1.
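The four functions can be tried on the standard normal distribution. This is a minimal sketch; the seed and the input values are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed so the random draws are reproducible

d <- dnorm(0, mean = 0, sd = 1)      # density of the standard normal at 0
p <- pnorm(1.96, mean = 0, sd = 1)   # P(X <= 1.96), close to 0.975
q <- qnorm(0.975, mean = 0, sd = 1)  # value with 97.5% of the mass below it, close to 1.96
r <- rnorm(5, mean = 10, sd = 2)     # five random draws with mean 10 and sd 2

print(c(density = d, cdf = p, quantile = q))
print(r)
```

pnorm() and qnorm() are inverses of each other, which is why 1.96 and 0.975 appear as a pair here.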

**Question**

**Explain the different functions that can be applied for Binomial distribution in R.**

Answer

The different functions that can be applied for Binomial distribution in R are as follows:

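dbinom(x, size, prob) – gives the probability of exactly x successes.

pbinom(x, size, prob) – gives the cumulative probability of x or fewer successes.

qbinom(p, size, prob) – gives the smallest number of successes whose cumulative probability is at least p.

rbinom(n, size, prob) – generates n random success counts.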

The parameters used are:

- x is a vector of numbers.
- p is a vector of probabilities.
- n is the number of observations.
- size is the number of trials.
- prob is the probability of success of each trial.
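As a minimal sketch, the functions can be tried on 10 tosses of a fair coin. The seed and the inputs are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed for reproducible draws

d <- dbinom(3, size = 10, prob = 0.5)    # P(X = 3) = 120/1024
p <- pbinom(3, size = 10, prob = 0.5)    # P(X <= 3) = 176/1024
q <- qbinom(0.5, size = 10, prob = 0.5)  # smallest k with P(X <= k) >= 0.5
r <- rbinom(4, size = 10, prob = 0.5)    # four simulated success counts

print(c(d = d, p = p, q = q))
print(r)
```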

**Question**

**What is the main difference between an array and a matrix?**

Answer

A matrix is always two-dimensional as it has only rows and columns. An array, however, can have any number of dimensions. For example, a 3×3×2 array represents 2 matrices, each of dimension 3×3.
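The difference shows up directly in the dim attribute. A minimal sketch with made-up values:

```r
m <- matrix(1:9, nrow = 3, ncol = 3)  # a 3x3 matrix
a <- array(1:18, dim = c(3, 3, 2))    # a 3x3x2 array

dim(m)               # 3 3
dim(a)               # 3 3 2
a[, , 2]             # the second 3x3 slice, holding the values 10 to 18
is.matrix(a[, , 1])  # each slice along the third dimension is itself a matrix
```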

**Question**

**How can you load and use a CSV file in R?**

Answer

A CSV file can be loaded using the read.csv function. R creates a data frame on reading the CSV files using this function.
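A minimal sketch of loading and using a CSV file follows. To stay self-contained it first writes a small made-up file to a temporary location; in practice you would pass the path of your own file.

```r
# Write a small, made-up CSV to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ann,31", "Bob,27"), path)

# Read it back; the first line is used as the column names.
df <- read.csv(path, stringsAsFactors = FALSE)

class(df)     # "data.frame"
str(df)       # one character column and one integer column
mean(df$age)  # columns are used like ordinary vectors

unlink(path)  # clean up the temporary file
```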

**Question**

**How do you get the name of the current working directory in R?**

Answer

The command getwd() gives the name of the current working directory in R.

**Question**

**How do you install a package in R?**

Answer

To install a package in R, you need to give the following command:

install.packages("package name")

**Question**

**What is the output of runif(6)?**

Answer

runif(6) generates 6 random numbers from a uniform distribution between 0 and 1.
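A short sketch of the call; the seed is arbitrary and only makes the output repeatable, and the second call shows the optional min and max arguments.

```r
set.seed(123)                     # arbitrary seed for reproducible output
u <- runif(6)                     # six draws between 0 and 1
print(u)

w <- runif(6, min = 5, max = 10)  # six draws between 5 and 10
print(w)
```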

**Question**

**Give the R command to get the probability of getting 26 or less heads from 51 tosses of a coin using pbinom.**

Answer

The R command to get the probability of getting 26 or fewer heads from 51 tosses of a coin using pbinom is:

x<-pbinom(26,51,0.5)

print(x)

The first command obtains the required probability and stores the value in x. The second command, i.e., print(x), shows the value of x.
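As a sanity check, the same probability can be rebuilt by summing the individual probabilities of 0 to 26 heads with dbinom. This is only a sketch of one way to verify the result; the name check is illustrative.

```r
x <- pbinom(26, 51, 0.5)             # P(X <= 26) for 51 fair tosses
check <- sum(dbinom(0:26, 51, 0.5))  # the same probability, term by term

print(x)               # roughly 0.61
all.equal(x, check)    # TRUE: both routes agree
```

The value is a little above 0.5 because, with 51 tosses, 25 or fewer heads already accounts for exactly half of the probability.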

**Question**

**Give the commands to obtain the mean, median and mode of a dataset.**

Answer

The command for obtaining the mean of a dataset is: mean(…)

The command for obtaining the median of a dataset is: median(…)

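Base R has no built-in command for the statistical mode of a dataset: mode(…) returns the storage mode of an object (such as "numeric"), not its most frequent value. The mode can instead be obtained by tabulating the values, for example with names(which.max(table(…))).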
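The sketch below runs the three measures on a made-up vector. Since base R's mode() reports how an object is stored, the statistical mode is computed with a small helper; stat_mode is a hypothetical name, not a built-in function.

```r
v <- c(2, 3, 3, 5, 7, 3, 5)  # made-up sample data

mean(v)    # 4
median(v)  # 3
mode(v)    # "numeric": the storage mode, not the most frequent value

# Hypothetical helper: return the most frequent value
# (the first one encountered if there is a tie).
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

stat_mode(v)  # 3
```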