
DBT Interview Questions and Answers

The document presents a series of interview questions related to dbt (data build tool), covering topics such as Jinja expressions, the purpose of profiles.yml, the dbt run command, materializations, incremental models, testing commands, snapshot strategies, and data integrity practices. Each question includes multiple-choice answers that assess the interviewee's knowledge of dbt functionalities and best practices. This document serves as a resource for evaluating candidates' understanding of dbt in a data engineering context.
Copyright
© All Rights Reserved

DBT INTERVIEW

1- Which of the following is a correct Jinja expression?

A: {{ my_variable }}

B: { my_variable }

C: [ my_variable ]

D: ( my_variable )

2- What is the purpose of the profiles.yml file in dbt?

A: To store sensitive credentials

B: To connect to the data warehouse

C: To avoid checking sensitive credentials into version control

D: To store project details

3- As a data developer working with dbt, what happens when you execute the "dbt run" command, and how does dbt build models in the data warehouse?

A: The "dbt run" command creates a temporary view in the source database and pulls data into the target data warehouse.

B: The "dbt run" command triggers a Python script that runs a series of SQL queries on the source database to extract and transform data.

C: The "dbt run" command builds each model in the data warehouse by wrapping its select statement in a "create view as" or "create table as" statement.

D: The "dbt run" command builds each model in the data warehouse by creating a temporary table in the target database and loading data into it.

4- What are the four materializations that ship with dbt?

A: The four materializations that ship with dbt are view, table, incremental, and ephemeral. There are no additional options for creating custom materializations.

B: The four materializations that ship with dbt are view, table, incremental, and ephemeral. You can also create your own custom materializations using advanced features of dbt.

C: The four materializations that ship with dbt are view, table, incremental, and ephemeral. You can also create your own custom materializations using SQL.

D: The four materializations that ship with dbt are view, table, incremental, and ephemeral. You can also create your own custom materializations by requesting them from dbt support.

5- You are a data scientist working on a data pipeline and you want to use incremental models in dbt. What information do you need to provide to dbt to use incremental models?

A: The model name and the name of the database adapter being used.

B: The unique key of the model and how to filter the rows on incremental runs.

C: The path to the SQL file containing the select statement.

D: The SQL query used to create the source data.

6- You are using dbt Core or dbt Cloud to run your project, and you want to ensure that all of your defined models are running correctly and that your tests are passing. What command should you use?

A: dbt run

B: dbt build

C: dbt test

D: dbt deploy

7- A data team is seeking to implement a dbt snapshot strategy for monitoring changes in their customer data. Specifically, they need a snapshot strategy that can detect changes based on a column indicating the time when a row was last updated. What snapshot strategy would be suitable for this purpose?

A: Check strategy

B: Surrogate key strategy

C: Timestamp strategy

D: None of the above

8- You are a data analytics engineer working on a dbt project. You have a model that involves complex transformations and is built on top of other views. Which materialization should you use for this model?

A: Incremental

B: Table

C: View

D: Seed

9- When using sources in dbt, which is the best practice for ensuring data integrity?

A: Running data integrity tests against the data warehouse

B: Running data integrity tests against individual models

C: Ignoring data integrity tests altogether

D: Running data integrity tests against the sources

E: Performing manual checks on the data

10- When using dbt, which of the following types of tests would you expect to be able to run out-of-the-box to ensure data integrity?

A: Data duplication

B: Null values in specific columns

C: Conformance to predefined values

D: Referential integrity between tables

Common questions


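The four materializations that ship with dbt are view, table, incremental, and ephemeral. You can also create your own custom materializations, but this is an advanced dbt feature: a custom materialization is defined with Jinja in a materialization block, not with plain SQL alone.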

To use incremental models in dbt, you tell dbt how to filter the rows on incremental runs and, where the model has one, its unique key. The filter limits each run to new or changed rows, and the unique key lets dbt update existing rows instead of inserting duplicates, which shortens runtime and reduces warehouse load.
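The two inputs for an incremental model (a row filter and a unique key) can be illustrated with a small stand-alone sketch. This is not dbt code and not how dbt is implemented; it uses Python's built-in sqlite3 module, and the table names, the `updated_at` column, and the `incremental_run` helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table source_events (id integer, status text, updated_at text)")
conn.execute("create table target_events (id integer, status text, updated_at text)")


def incremental_run(conn):
    """Hypothetical helper: load only rows newer than the target, upserting by id."""
    (high_water_mark,) = conn.execute(
        "select max(updated_at) from target_events"
    ).fetchone()
    if high_water_mark is None:
        # First run: the target is empty, so every source row is processed.
        rows = conn.execute(
            "select id, status, updated_at from source_events"
        ).fetchall()
    else:
        # Incremental run: the filter keeps only new or changed rows.
        rows = conn.execute(
            "select id, status, updated_at from source_events where updated_at > ?",
            (high_water_mark,),
        ).fetchall()
    for row in rows:
        # The unique key (id) lets a changed row replace its older version.
        conn.execute("delete from target_events where id = ?", (row[0],))
        conn.execute("insert into target_events values (?, ?, ?)", row)
    return len(rows)


conn.executemany(
    "insert into source_events values (?, ?, ?)",
    [(1, "new", "2024-01-01"), (2, "new", "2024-01-02")],
)
first_run = incremental_run(conn)

# Row 1 changes and row 3 arrives; row 2 is untouched and is skipped next time.
conn.execute(
    "update source_events set status = 'shipped', updated_at = '2024-01-03' where id = 1"
)
conn.execute("insert into source_events values (3, 'new', '2024-01-04')")
second_run = incremental_run(conn)

final_rows = conn.execute(
    "select id, status, updated_at from target_events order by id"
).fetchall()
```

In this sketch the second run touches only two of the three source rows, and the target ends up with one row per id rather than a duplicate for id 1.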

The correct Jinja expression in a dbt project is {{ my_variable }}: Jinja expressions are delimited by double curly braces.

The profiles.yml file holds the connection details that dbt uses to connect to the data warehouse. Because it is typically kept outside the project directory, sensitive credentials are not checked into version control.

A best practice for ensuring data integrity in dbt is running data integrity tests against the sources. This ensures that the data entering the pipeline meets quality standards before any downstream processing.

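dbt provides four out-of-the-box tests: unique (data duplication), not_null (null values in specific columns), accepted_values (conformance to predefined values), and relationships (referential integrity between tables). All four options in the question can therefore be checked without writing custom tests.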
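A dbt test is a query that selects failing rows, and the test passes when the query returns nothing. The sketch below mimics the four kinds of check with Python's built-in sqlite3 module. It is an illustration under assumed table and column names (`orders`, `customers`, `status`), not the SQL that dbt itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table customers (id integer);
    create table orders (id integer, customer_id integer, status text);
    insert into customers values (1), (2);
    insert into orders values
        (10, 1, 'placed'),
        (10, 2, 'shipped'),
        (11, null, 'lost'),
        (12, 9, 'placed');
    """
)

# Each check selects the rows that violate the rule; no rows means a pass.
checks = {
    "unique": "select id from orders group by id having count(*) > 1",
    "not_null": "select id from orders where customer_id is null",
    "accepted_values": (
        "select id from orders where status not in ('placed', 'shipped')"
    ),
    "relationships": (
        "select o.id from orders o "
        "left join customers c on o.customer_id = c.id "
        "where o.customer_id is not null and c.id is null"
    ),
}

failures = {name: conn.execute(sql).fetchall() for name, sql in checks.items()}
failed_checks = sorted(name for name, rows in failures.items() if rows)
```

The sample data is deliberately dirty, so every check reports at least one failing order id; with clean data each query would return no rows.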

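For models that involve complex transformations and are built on top of other views, the 'Table' materialization should be used. A view re-runs its underlying SQL every time it is queried, so views that perform significant transformations or are stacked on other views become slow to query; a table stores the result, which makes downstream queries faster.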

Executing the 'dbt run' command builds each model in the data warehouse by wrapping the model's select statement in a 'create view as' or 'create table as' statement.
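The wrapping step can be sketched in a few lines. The example below is a simplified illustration using Python's built-in sqlite3 module, not dbt's implementation; `build_statement`, `raw_orders`, and the model names are hypothetical.

```python
import sqlite3


def build_statement(model_name, select_sql, materialized="view"):
    """Hypothetical helper: wrap a model's select statement in DDL."""
    if materialized not in ("view", "table"):
        raise ValueError(f"unsupported materialization: {materialized}")
    return f"create {materialized} {model_name} as\n{select_sql}"


conn = sqlite3.connect(":memory:")
conn.execute("create table raw_orders (id integer, amount real)")
conn.executemany("insert into raw_orders values (?, ?)", [(1, 10.0), (2, 25.5)])

# The model itself is only a select statement.
model_sql = "select id, amount * 2 as doubled_amount from raw_orders"

view_ddl = build_statement("orders_view", model_sql, "view")
table_ddl = build_statement("orders_table", model_sql, "table")
conn.execute(view_ddl)
conn.execute(table_ddl)

view_rows = conn.execute("select * from orders_view order by id").fetchall()
table_rows = conn.execute("select * from orders_table order by id").fetchall()

# A row that arrives after the build shows up in the view, which re-runs the
# select on every query, but not in the table, which stored its result.
conn.execute("insert into raw_orders values (3, 1.0)")
view_count = conn.execute("select count(*) from orders_view").fetchone()[0]
table_count = conn.execute("select count(*) from orders_table").fetchone()[0]
```

The same select statement produces both objects; only the wrapper differs. The final two counts show the trade-off between the two materializations: the view is always current but recomputed on each query, while the table is a stored result that stays fixed until the next build.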

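The 'dbt build' command is used to ensure that all defined models run correctly and that their tests pass. It runs models and tests together (along with seeds and snapshots) in dependency order, whereas 'dbt run' only builds models and 'dbt test' only runs tests.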

The Timestamp strategy is suitable for detecting changes in a dbt snapshot based on a column indicating the last time a row was updated.
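The idea behind a timestamp-based snapshot can be sketched as follows: a row counts as changed when its last-updated value is newer than the one recorded in the current snapshot record. This is a simplified illustration with Python's built-in sqlite3 module, not dbt's snapshot logic; the table names, the `valid_from` and `valid_to` columns, and `snapshot_run` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table customers (id integer, email text, updated_at text);
    create table customers_snapshot (
        id integer, email text, updated_at text, valid_from text, valid_to text
    );
    """
)


def snapshot_run(conn):
    """Hypothetical helper: record new versions of rows whose updated_at moved on."""
    changed = conn.execute(
        """
        select s.id, s.email, s.updated_at
        from customers s
        left join customers_snapshot snap
          on snap.id = s.id and snap.valid_to is null
        where snap.id is null or s.updated_at > snap.updated_at
        """
    ).fetchall()
    for cust_id, email, updated_at in changed:
        # Close the current record (if any), then open a new one.
        conn.execute(
            "update customers_snapshot set valid_to = ? "
            "where id = ? and valid_to is null",
            (updated_at, cust_id),
        )
        conn.execute(
            "insert into customers_snapshot values (?, ?, ?, ?, null)",
            (cust_id, email, updated_at, updated_at),
        )
    return len(changed)


conn.execute("insert into customers values (1, 'a@example.com', '2024-01-01')")
first_run = snapshot_run(conn)
unchanged_run = snapshot_run(conn)

conn.execute(
    "update customers set email = 'b@example.com', updated_at = '2024-02-01' where id = 1"
)
changed_run = snapshot_run(conn)

history = conn.execute(
    "select id, email, valid_from, valid_to from customers_snapshot order by valid_from"
).fetchall()
```

Only the run that follows the update adds a record, and the earlier version is kept with a closing timestamp, so the snapshot table holds the full history of the row.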
