For high compression efficiency, 3-D video coding usually employs a multimode methodology to exploit the dependencies between multiple views as well as between texture and depth. However, different coding modes possess different error propagation behaviour when the compressed 3-D video bit stream is transmitted over packet-switched networks, and thus lead to different amounts of visual distortion. Further, the texture and depth distortions combine in a highly complex fashion to produce the overall view synthesis distortion. To minimize the expected view synthesis distortion, this paper proposes an efficient rate-distortion optimized algorithm for the joint selection of texture and depth modes. First, a statistical model is developed to estimate the overall view synthesis distortion, in which the channel distortions caused by error propagation under different coding modes are analyzed. Then, joint optimization of texture and depth modes is derived within an operational rate-distortion framework using the Lagrange multiplier method. The adjacent-block dependency caused by the warping operation is explicitly considered in the optimization, for which we develop a dynamic programming method to find the optimal solution. Finally, we extend the Lagrange minimization method to the more general variable-block-size prediction case, where the optimal quadtree structure and the combined coding modes are jointly determined using a multi-level dual trellis. Experimental results are presented for a wide range of packet loss rates to illustrate the effectiveness of the proposed algorithm.
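The combination of Lagrangian mode selection and dynamic programming over adjacent-block dependencies can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a one-dimensional chain of blocks, a first-order dependency in which the expected distortion of a block depends only on its own mode and the mode of the preceding block, and a fixed multiplier. The function name `select_modes` and the callbacks `distortion` and `rate` are hypothetical placeholders for a real distortion model and rate estimate.

```python
def select_modes(num_blocks, modes, distortion, rate, lam):
    """Viterbi-style search minimizing the sum of D + lam * R over a block chain.

    All names here are illustrative assumptions, not taken from the paper.
    modes: candidate joint modes, e.g. (texture_mode, depth_mode) tuples.
    distortion(i, prev_mode, mode): expected distortion of block i given the
        previous block's mode (prev_mode is None for the first block).
    rate(i, mode): bit cost of coding block i with `mode`.
    Requires num_blocks >= 1.
    """
    # cost[m]: best Lagrangian cost of blocks 0..i when block i uses mode m
    cost = {m: distortion(0, None, m) + lam * rate(0, m) for m in modes}
    back = []  # back[i - 1][m]: best predecessor mode for block i in mode m

    for i in range(1, num_blocks):
        new_cost, pointers = {}, {}
        for m in modes:
            # Pick the predecessor mode that minimizes the accumulated cost.
            best_prev = min(modes, key=lambda p: cost[p] + distortion(i, p, m))
            new_cost[m] = (cost[best_prev]
                           + distortion(i, best_prev, m)
                           + lam * rate(i, m))
            pointers[m] = best_prev
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final state to recover the mode sequence.
    last = min(modes, key=lambda m: cost[m])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[last]
```

With M joint modes and N blocks, this search evaluates on the order of N·M² transitions instead of the Mᴺ combinations an exhaustive search would need, which is the usual motivation for a trellis formulation when neighbouring blocks are coupled.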