Translated using GPU and variable #202

Open: wants to merge 2 commits into base `master`.
SOURCE/how_tos/using_gpu/index.md
# Using GPUs <a class="md-anchor" id="AUTOGENERATED-using-gpus"></a>

## Supported devices <a class="md-anchor" id="AUTOGENERATED-supported-devices"></a>

On a typical system, there are multiple computing devices. In TensorFlow, the
supported device types are `CPU` and `GPU`. They are represented as
`strings`. For example:

* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, etc.

If a TensorFlow operation has both CPU and GPU implementations, the
GPU devices will be given priority when the operation is assigned to
a device. For example, `matmul` has both CPU and GPU kernels. On a
system with devices `cpu:0` and `gpu:0`, `gpu:0` will be selected to run
`matmul`.
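The priority rule above can be illustrated with a small standalone sketch. Everything here is invented for illustration (the kernel registry, the `DecodeCsv` example op, and the `place` function are not TensorFlow APIs); it only mirrors the stated rule that an op with a GPU kernel goes to the lowest-ID available GPU:

```python
# Hypothetical sketch of the "GPU wins when both kernels exist" rule.
# The kernel registry below is invented for illustration; it is not
# TensorFlow's real data structure.
KERNELS = {
    "MatMul": {"CPU", "GPU"},
    "DecodeCsv": {"CPU"},  # stand-in for a CPU-only op
}

def place(op_name, available_devices):
    """Pick a device for op_name, preferring GPU when a GPU kernel exists."""
    has_gpu_kernel = "GPU" in KERNELS[op_name]
    gpus = [d for d in available_devices if d.startswith("/gpu:")]
    if has_gpu_kernel and gpus:
        return gpus[0]          # lowest-ID GPU gets priority
    return "/cpu:0"

print(place("MatMul", ["/cpu:0", "/gpu:0"]))     # /gpu:0
print(place("DecodeCsv", ["/cpu:0", "/gpu:0"]))  # /cpu:0
```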

## Logging Device placement <a class="md-anchor" id="AUTOGENERATED-logging-device-placement"></a>

To find out which devices your operations and tensors are assigned to, create
the session with `log_device_placement` configuration option set to `True`.

```python
# Creates a graph.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)
```

You should see the following output:

```
Device mapping:
MatMul: /job:localhost/replica:0/task:0/gpu:0

```

## Manual device placement <a class="md-anchor" id="AUTOGENERATED-manual-device-placement"></a>

If you would like a particular operation to run on a device of your
choice instead of what's automatically selected for you, you can use
`with tf.device` to create a device context such that all the operations
within that context will have the same device assignment.

```python
# Creates a graph.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)
```

You will see that now `a` and `b` are assigned to `cpu:0`.

```
Device mapping:
MatMul: /job:localhost/replica:0/task:0/gpu:0
[ 49. 64.]]
```

## Using a single GPU on a multi-GPU system <a class="md-anchor" id="AUTOGENERATED-using-a-single-gpu-on-a-multi-gpu-system"></a>

If you have more than one GPU in your system, the GPU with the lowest ID will be
selected by default. If you would like to run on a different GPU, you will need
to specify the preference explicitly:

```python
# Creates a graph.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)
```

If the device you have specified does not exist, you will get
`InvalidArgumentError`:

```
InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
values: 1 2 3...>, _device="/gpu:2"]()]]
```

If you would like TensorFlow to automatically choose an existing and
supported device to run the operations in case the specified one doesn't
exist, you can set `allow_soft_placement` to `True` in the configuration
option when creating the session.
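The fallback behavior can be modeled in a few lines of plain Python. This is a toy, not TensorFlow's placement logic; the `resolve_device` function and its fallback choice are invented to show the idea that `allow_soft_placement` substitutes an existing device instead of raising:

```python
# Toy model of allow_soft_placement: fall back to an existing device
# instead of raising when the requested one is absent.
class InvalidArgumentError(Exception):
    pass

def resolve_device(requested, available, allow_soft_placement=False):
    if requested in available:
        return requested
    if allow_soft_placement:
        return available[0]  # pick some existing, supported device
    raise InvalidArgumentError(
        "Cannot satisfy explicit device specification %r" % requested)

print(resolve_device("/gpu:2", ["/cpu:0", "/gpu:0"],
                     allow_soft_placement=True))  # /cpu:0
```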

```python
# Creates a graph.
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
print sess.run(c)
```

## Using multiple GPUs <a class="md-anchor" id="AUTOGENERATED-using-multiple-gpus"></a>

If you would like to run TensorFlow on multiple GPUs, you can construct your
model in a multi-tower fashion where each tower is assigned to a different GPU.
For example:

```python
# Creates a graph.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(sum)
```

You will see the following output.

```
Device mapping:
AddN: /job:localhost/replica:0/task:0/cpu:0
[ 98. 128.]]
```
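The tower pattern can also be mimicked without TensorFlow: each tower computes the same `MatMul` (on its own GPU in the real example) and the partial results are summed, which is what the `AddN` on `/cpu:0` does. A plain-Python sketch, assuming the same `a` and `b` constants as the earlier examples, that reproduces the numbers in the log:

```python
# Pure-Python sketch of the multi-tower pattern: each "tower" computes
# c = a * b, then the tower outputs are added element-wise (AddN).
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# One tower per (hypothetical) GPU; both compute the same product.
towers = [matmul(a, b) for _gpu in ("/gpu:2", "/gpu:3")]

# AddN on the CPU: element-wise sum over the tower outputs.
added = [[sum(t[i][j] for t in towers) for j in range(len(towers[0][0]))]
         for i in range(len(towers[0]))]
print(added)  # [[44.0, 56.0], [98.0, 128.0]]
```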

The [cifar10 tutorial](../../tutorials/deep_cnn/index.md) is a good example
demonstrating how to do training with multiple GPUs.
SOURCE/how_tos/variables/index.md
# Variables: Creation, Initialization, Saving, and Loading <a class="md-anchor" id="AUTOGENERATED-variables--creation--initialization--saving--and-loading"></a>

When you train a model, you use [variables](../../api_docs/python/state_ops.md)
to hold and update parameters. Variables are in-memory buffers containing
tensors. They must be explicitly initialized and can be saved to disk during
and after training. You can later restore saved values to exercise or analyse
the model.

This document references the following TensorFlow classes. Follow the links to
their reference manual for a complete description of their API:

* The [`tf.Variable`](../../api_docs/python/state_ops.md#Variable) class.
* The [`tf.train.Saver`](../../api_docs/python/state_ops.md#Saver) class.


## Creation <a class="md-anchor" id="AUTOGENERATED-creation"></a>

When you create a [Variable](../../api_docs/python/state_ops.md) you pass a
`Tensor` as its initial value to the `Variable()` constructor. TensorFlow
provides a collection of ops that produce tensors often used for initialization
from [constants or random values](../../api_docs/python/constant_op.md).

Note that all these ops require you to specify the shape of the tensors. That
shape automatically becomes the shape of the variable. Variables generally
have a fixed shape, but TensorFlow provides advanced mechanisms to reshape
variables.

```python
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
```

Calling `tf.Variable()` adds several ops to the graph:

* A `variable` op that holds the variable value.
* An initializer op that sets the variable to its initial value. This is
actually a `tf.assign` op.
* The ops for the initial value, such as the `zeros` op for the `biases`
variable in the example are also added to the graph.

The value returned by `tf.Variable()` is an instance of the Python class
`tf.Variable`.

## Initialization <a class="md-anchor" id="AUTOGENERATED-initialization"></a>

Variable initializers must be run explicitly before other ops in your model can
be run. The easiest way to do that is to add an op that runs all the variable
initializers, and run that op before using the model.

You can alternatively restore variable values from a checkpoint file, see
below.

Use `tf.initialize_all_variables()` to add an op to run variable initializers.
Only run that op after you have fully constructed your model and launched it in
a session.

```python
# Create two variables.
with tf.Session() as sess:
...
```

### Initialization from another Variable <a class="md-anchor" id="AUTOGENERATED-initialization-from-another-variable"></a>

You sometimes need to initialize a variable from the initial value of another
variable. As the op added by `tf.initialize_all_variables()` initializes all
variables in parallel you have to be careful when this is needed.

To initialize a new variable from the value of another variable use the other
variable's `initialized_value()` property. You can use the initialized value
directly as the initial value for the new variable, or you can use it as any
other tensor to compute a value for the new variable.


```python
w2 = tf.Variable(weights.initialized_value(), name="w2")
w_twice = tf.Variable(weights.initialized_value() * 0.2, name="w_twice")
```

### Custom Initialization <a class="md-anchor" id="AUTOGENERATED-custom-initialization"></a>

The convenience function `tf.initialize_all_variables()` adds an op to
initialize *all variables* in the model. You can also pass it an explicit list
of variables to initialize. See the
[Variables Documentation](../../api_docs/python/state_ops.md) for more options,
including checking if variables are initialized.

## Saving and Restoring <a class="md-anchor" id="AUTOGENERATED-saving-and-restoring"></a>

The easiest way to save and restore a model is to use a `tf.train.Saver` object.
The constructor adds `save` and `restore` ops to the graph for all, or a
specified list, of the variables in the graph. The saver object provides
methods to run these ops, specifying paths for the checkpoint files to write to
or read from.

### Checkpoint Files <a class="md-anchor" id="AUTOGENERATED-checkpoint-files"></a>

Variables are saved in binary files that, roughly, contain a map from variable
names to tensor values.

When you create a `Saver` object, you can optionally choose names for the
variables in the checkpoint files. By default, it uses the value of the
[`Variable.name`](../../api_docs/python/state_ops.md#Variable.name) property for
each variable.
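The "map from variable names to tensor values" idea can be mimicked with the standard library. This is only an analogy: `pickle` stands in for TensorFlow's real binary checkpoint format, and the example names mirror the variables created earlier:

```python
# A checkpoint is, roughly, a name -> value map. pickle stands in for
# TensorFlow's actual binary format here; the names default to each
# variable's Variable.name property in the real Saver.
import io
import pickle

variables = {"weights": [[0.1, 0.2], [0.3, 0.4]], "biases": [0.0, 0.0]}

buf = io.BytesIO()
pickle.dump(variables, buf)   # "save" to the checkpoint file

buf.seek(0)
restored = pickle.load(buf)   # "restore" from the checkpoint file
print(restored["biases"])     # [0.0, 0.0]
```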

### Saving Variables <a class="md-anchor" id="AUTOGENERATED-saving-variables"></a>

Create a `Saver` with `tf.train.Saver()` to manage all variables in
the model.

```python
# Create some variables.
with tf.Session() as sess:
print "Model saved in file: ", save_path
```

### Restoring Variables <a class="md-anchor" id="AUTOGENERATED-restoring-variables"></a>

The same `Saver` object is used to restore variables. Note that when you
restore variables from a file you do not have to initialize them beforehand.

```python
# Create some variables.
with tf.Session() as sess:
...
```

### Choosing which Variables to Save and Restore <a class="md-anchor" id="AUTOGENERATED-choosing-which-variables-to-save-and-restore"></a>

If you do not pass any argument to `tf.train.Saver()` the saver handles all
variables in the graph. Each one of them is saved under the name that was
passed when the variable was created.

It is sometimes useful to explicitly specify names for variables in the
checkpoint files. For example, you may have trained a model with a variable
named `"weights"` whose value you want to restore in a new variable named
`"params"`.

It is also sometimes useful to only save or restore a subset of the variables
used by a model. For example, you may have trained a neural net with 5 layers,
and you now want to train a new model with 6 layers, restoring the parameters
from the 5 layers of the previously trained model into the first 5 layers of
the new model.

You can easily specify the names and variables to save by passing to the
`tf.train.Saver()` constructor a Python dictionary: keys are the
names to use, values are the variables to manage.
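The names-to-variables dictionary can again be sketched with plain Python. The values and the `"params"` name are invented for illustration, and `pickle` stands in for the real checkpoint format; the point is only that the dict's keys become the checkpoint names and anything not listed is simply not saved:

```python
# Sketch of saving a subset under different names: keys are the names
# to use in the checkpoint, values are the variables to manage.
# Mimics tf.train.Saver({"params": weights}); pickle stands in for the
# real checkpoint format.
import pickle

weights = [0.5, -0.5]   # trained under the name "weights"
biases = [0.1]          # deliberately left out of the save map

save_map = {"params": weights}        # restore later under "params"
blob = pickle.dumps(save_map)

restored = pickle.loads(blob)
print(sorted(restored))               # ['params']  (only the subset)
```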

Notes:

* You can create as many saver objects as you want if you need to save and
restore different subsets of the model variables. The same variable can be
listed in multiple saver objects, its value is only changed when the saver
`restore()` method is run.

* If you only restore a subset of the model variables at the start
of a session, you have to run an initialize op for the other variables. See
[`tf.initialize_variables()`](../../api_docs/python/state_ops.md#initialize_variables)
for more information.

```python
# Create some variables.
```